Data needed on long-running suites

All,

David and I are working on ways to shorten the validation phase of the inner programming loop (feature -> test -> code -> validate). One of the ideas is to find a general way to run suites faster by running tests in parallel. To find effective parallelization strategies, we need data on test run times. Rather than build big infrastructure to do this (which would undoubtedly be cool), I'd like to start with the simplest thing that could possibly work. So:

If you have a long-running test suite and
You run it using Ant and
You can use the XML formatter ("<formatter type="xml"/>") and
You don't mind sharing your test names with me confidentially

Would you please zip your reports and email them to me? If they're too big for email, please let me know and we'll figure out a backup plan. I'd appreciate any context you can provide--how long the suite has been in development, the experience level of the developers, whatever else you think we might need to know.

The first data set I looked at was from DevCreek: >90M test runs from production coding representing more than 50 person-years of development. To my surprise, the test runs exhibit a power law distribution (way lots of fast tests, a few very long running tests, plot a histogram log-log and you get a straight line). I have no idea what this means, but it brings to mind the Asimov quote, "The most exciting phrase to hear in science, the one that heralds new discoveries, is not 'Eureka!' (I found it!) but 'That's funny ...'" I've found many power law distributions in the static and dynamic structure of code, but the mechanisms influencing test run times seem to be completely different than those influencing code structure.

Anyway, I'd love to validate those findings. The Ant XML format seems like a good place to start. Alternatively, you could send me one or more files with test run times one per line.

Questions and comments appreciated.

Yours in science,

Kent Beck
Three Rivers Institute
Re: Data needed on long-running suites

Kent,

Did you ever get any results from this? I'm betting I'll have to write an anonymization tool to bundle up whatever Google data we can share--would it help if I offered it to the list?

David

On Mon, Sep 8, 2008 at 7:53 PM, kentb <kentb@...> wrote:
> [snip]
RE: Data needed on long-running suites

I got one submission: data from 2000 Python tests. Interestingly, they didn't show any clear trend in runtimes. I have also analyzed the data from Gump, and found a clear power-law-ish distribution of runtimes.

I'm still open for more data, or suggestions on how to make the submission process simpler.

Cheers,

Kent Beck
Three Rivers Institute

_____
From: junit@... [mailto:junit@...] On Behalf Of David Saff
Sent: Monday, September 15, 2008 9:50 AM
To: junit@...
Subject: Re: [junit] Data needed on long-running suites

> [snip]
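For readers who want to try the "plot a histogram log-log and you get a straight line" check on their own run times, here is one rough way it might be done. This is only a sketch and not the analysis used on the DevCreek or Gump data: the function names, the bin width, and the synthetic sample are all assumptions made for illustration. A roughly straight line on log-log axes is suggestive of a power law rather than proof of one.

```python
import math


def log_binned_density(times, bins_per_decade=2):
    """Histogram positive run times into log-spaced bins.

    Returns (bin centre, density) pairs for the non-empty bins, where the
    centre is the geometric mean of the bin edges and the density is the
    fraction of tests in the bin divided by the bin width.
    """
    positive = [t for t in times if t > 0]  # log10 is undefined for 0
    lowest_decade = math.floor(math.log10(min(positive)))
    counts = {}
    for t in positive:
        index = math.floor((math.log10(t) - lowest_decade) * bins_per_decade)
        counts[index] = counts.get(index, 0) + 1
    points = []
    for index in sorted(counts):
        left = 10 ** (lowest_decade + index / bins_per_decade)
        right = 10 ** (lowest_decade + (index + 1) / bins_per_decade)
        density = counts[index] / (len(positive) * (right - left))
        points.append((math.sqrt(left * right), density))
    return points


def loglog_slope(points):
    """Least-squares slope of log10(density) against log10(bin centre)."""
    xs = [math.log10(x) for x, _ in points]
    ys = [math.log10(y) for _, y in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic run times (NOT real measurements): evenly spaced quantiles of a
# density proportional to t**-2 above 1 ms, so the fitted slope should be near -2.
synthetic = [0.001 / (1 - (i + 0.5) / 10000) for i in range(10000)]
print(loglog_slope(log_binned_density(synthetic)))
```

Log-spaced bins are used because equal-width bins would leave nearly every long-running test alone in its own bin.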