How to run long-running (>10 hours) verification tests for continuous integration of scientific software?
This question follows up on a previous question asked here about continuous integration for scientific software. Like the poster of that question, I am developing software for numerical simulations, and I am in the process of applying continuous integration (CI).
My main problem is that the tests I use for verification have to run for a long time (>10 hours). Those tests also require high-performance computing resources (an HPC cluster).
From what I have read so far, the idea behind CI is to ensure that a merge into the remote repository is blocked if:
- the build has failed (this is not a problem)
- a verification case is broken by the pull request.
Testing for build success is possible on the server that hosts the Git repository (Bitbucket, GitLab, etc.), because the code compiles rather quickly (on the order of minutes).
Testing the verification cases would require the remote Git server to communicate with the HPC cluster and run simulations there until it is certain that no verification case is broken.
I am using Bitbucket as the remote Git server, and I have been reading about Bitbucket Pipelines, Travis, Jenkins, etc.
I have found information about integrating Bitbucket and Jenkins: here and here.
The problem I see with using another server to run the tests is authentication (security). Users access the HPC cluster via SSH. The cluster manages the execution of simulations with the help of a workload manager that schedules simulations as jobs, each with a queue, priority, and status.
If I use Jenkins to submit a job on the HPC cluster via the SSH plugin, the script will submit a simulation and exit with status 0 if the submission succeeded. That does not mean the test has passed, because the simulation can take hours to complete.
Also, if the Jenkins server is to connect to the HPC cluster over SSH, a key pair must be set up: the cluster holds the public key and Jenkins the private one. I haven't found a way for Bitbucket to communicate this information to Jenkins.
Has anyone tried to use continuous integration with tests that run for hours/days?
Edit: The responses to the question address the fact that one cannot wait 10 hours for a commit to be accepted. That is not the plan: the idea is to run the whole test suite when a pull request is submitted to the main upstream repository, to make sure that nothing in the pull request breaks what has already been implemented. The same tests can (and should) be run manually by the devs on the HPC cluster before even submitting the pull request. In my field, a pull request means a numerical algorithm has been developed and tested; this happens maybe once a month.
continuous-integration integration-tests
If you run the verification tests manually, how are you informed that the tests have finished and what the results are? Would it be possible to have an alternative submission script that waits till the job has finished executing?
– Bart van Ingen Schenau
Dec 9 at 14:02
Occasionally, in a former team of mine, we'd do things like test offline rendering of very complex scenes with all the gory bells and whistles (lots of indirect lighting with high bounce counts, very high sample counts, etc.). That sometimes took around 10 hours to render. What we did in that case was dedicate a separate machine/script which would run on its own independently, pick up the latest changes from version control, build, and render away once a day or so. It might miss some commits in between, but it would catch cases where the render times were negatively affected or the correctness of the results changed.
– Dragon Energy
Dec 9 at 16:42
And that was a separate process from the usual kind of CI. It was just a simple dedicated machine/process which would, after ages spent rendering, pick up the latest changes from version control periodically. If the changeset was different from before, it would go back to spending ages rendering again. In that case we actually verified the output manually, since it was just a once-a-day process or so, and with offline rendering of the kind we had there are all sorts of speed hacks which skew results a bit; the idea of "correctness" is kind of tied to what "looks acceptable".
– Dragon Energy
Dec 9 at 16:46
Take a look at the OpenCV buildbot environment. The framework OpenCV chose to manage theirs is called, simply, BuildBot (buildbot.net). The essence is that, for software where testing is long (as with all large-scale numerically intensive algorithms), compiling and testing are slightly asynchronous from version control. Gating is still possible thanks to the pull-request model. If sufficient machines are dedicated to testing, there is no need to omit any commits or pull requests.
– rwong
Dec 9 at 17:46
However, there is a general rule of thumb for the upper limit for asynchronous build verification: around 15 minutes. If it takes longer than that, the consequences mentioned in candied_orange's answer creep in.
– rwong
Dec 9 at 17:49
edited Dec 10 at 20:12 by Iwan Aucamp (1033)
asked Dec 9 at 13:08 by tmaric (1314)
3 Answers
I believe you have three issues here.
- Executing asynchronous tests
- Security Credentials
- Quality Gateways
Asynchronous Tests
While not ideal, Jenkins was designed for synchronous workloads, so there is definitely a mismatch here.
So my first question is how do you know when the tests have completed for better or worse?
- If you have to poll a directory or a service for status and results, simply start that polling as the next step after job submission.
- If you receive an event, split the pipeline into two segments, and have the second segment triggered by that event. You may need to write a small program that maps the event sent from your cluster into a web request triggering the next stage in Jenkins.
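The polling variant can be sketched roughly as follows. This is a minimal sketch that assumes a Slurm-style scheduler with an `sacct` command; the command-line flags, poll interval, and state names are specific to Slurm, so adapt them to whatever workload manager your cluster runs:

```python
import subprocess
import time

# Terminal Slurm job states: the job is done, for better or worse.
FINISHED = {"COMPLETED", "FAILED", "CANCELLED", "TIMEOUT", "NODE_FAIL"}

def job_outcome(state):
    """Map a scheduler state string to None (still running),
    True (finished successfully) or False (finished with failure)."""
    state = state.strip().upper()
    if state not in FINISHED:
        return None          # PENDING, RUNNING, ...
    return state == "COMPLETED"

def wait_for_job(job_id, poll_seconds=300):
    """Poll the scheduler until the job reaches a terminal state."""
    while True:
        out = subprocess.check_output(
            ["sacct", "-j", str(job_id), "-n", "-X", "-o", "State"],
            text=True)
        outcome = job_outcome(out)
        if outcome is not None:
            return outcome   # Jenkins step exits 0 iff this is True
        time.sleep(poll_seconds)
```

A Jenkins pipeline step would submit the job, capture the job id, then call `wait_for_job` so the build stays red or green according to the actual simulation result rather than the submission result.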
Security Credentials
Jenkins can store key information securely using its internal credential management. This will require setting up a specific key for Jenkins. That may even have an upside: usage data can be collected for, and limits imposed on, the Jenkins-specific account on the compute cluster.
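One way to limit what that Jenkins-specific key can do on the cluster is an OpenSSH forced command in `authorized_keys`. The host name, script path, and account name below are illustrative, not taken from the question:

```
# ~jenkins-ci/.ssh/authorized_keys on the HPC login node:
# this key may only run the submission wrapper, and only from the Jenkins host.
from="jenkins.example.org",command="/opt/ci/submit-and-wait.sh",no-port-forwarding,no-agent-forwarding,no-X11-forwarding ssh-ed25519 AAAA... jenkins-ci
```

With this in place, even if the Jenkins credential store is compromised, the key cannot be used for an interactive shell or arbitrary commands on the cluster.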
Quality Gateways
Continuous integration pipelines are built around providing feedback as quickly as possible.
So when you say that testing takes ten hours, I wince, because I hear ten hours until any results are available. Taking ten hours to complete testing is fine; indeed, there are projects out there that take days to test completely. The point is that continuous means a stream of information: it needs to be available in a timely manner, not ten hours later.
Think of the pipeline as raising a Quality meter. Each stage in the pipeline pushes that meter higher. You want to schedule your stages in such a way that you discover errors early, by essentially pushing the quality meter up as quickly as possible.
This maximises the likelihood that the developer is still thinking about the problem and can fix it quickly. It also maximises the total resource saving, by discovering that the build is bad earlier and not pursuing further verification. The pull request can also be rejected sooner, since any error is a show-stopper.
Test Suites
So try to split your tests up into short-running test suites. Anywhere between 15 minutes and 1 hour is a good length. Much shorter and the overhead becomes burdensome; much longer and the results won't keep up that continuity of information flow. Try to schedule the shorter tests first (for that quick turnaround), but balance that against the total time to completion. It may make sense to operate a single queue of faster tests and parallelise the longer tests alongside it.
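The scheduling policy above can be sketched as a small ordering function. The suite names and duration estimates here are made up for illustration; in practice the estimates would come from previous runs:

```python
def schedule(suites, fast_cutoff_minutes=60):
    """Split test suites into a serial queue of fast suites (shortest
    first, for the quickest feedback) and a pool of long suites to run
    in parallel alongside it.  `suites` is a list of (name, est_minutes)."""
    fast = sorted((s for s in suites if s[1] <= fast_cutoff_minutes),
                  key=lambda s: s[1])
    slow = [s for s in suites if s[1] > fast_cutoff_minutes]
    return fast, slow

fast, slow = schedule([("smoke", 10), ("convergence", 600),
                       ("unit", 2), ("coupling", 45)])
# fast queue: unit, smoke, coupling; slow pool: convergence
```

The fast queue raises the "quality meter" in small, frequent steps while the expensive suites grind away in parallel.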
Smoke Tests
Smoke tests are another avenue for reducing the time to discover an error: reduce the size of the problem and run a single variation of an algorithm up front. Such a test runs faster than its heavier-weight cousins and still exercises much of the mechanics. It is not as complete as the heavier tests, but it will locate obvious issues sooner. Again, balance these: they don't replace your current tests, they outline important sections in the hope of identifying breakages earlier.
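One way to derive such a smoke variant mechanically is to shrink the full verification case. The configuration keys (`resolution`, `end_time`, `label`) are hypothetical stand-ins for whatever your case definitions actually contain:

```python
def smoke_variant(config, factor=4):
    """Derive a cheap smoke-test case from a full verification case by
    coarsening the mesh and shortening the simulated time span."""
    smoke = dict(config)
    smoke["resolution"] = max(8, config["resolution"] // factor)
    smoke["end_time"] = config["end_time"] / factor
    smoke["label"] = config["label"] + "-smoke"
    return smoke

full = {"label": "dam-break", "resolution": 256, "end_time": 2.0}
print(smoke_variant(full))
```

Generating the smoke case from the full case, rather than maintaining it by hand, keeps the two from drifting apart as the verification suite evolves.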
Reduce Platform Dependence
While you still need that final verification step on the compute cluster, most errors are detectable as unit tests. The more of the algorithm you can cover in unit testing, the more quickly you can build confidence that it has not been broken. As a bonus, unit tests don't need the compute cluster in order to run.
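For instance, a discretisation kernel can be checked against an analytic solution in seconds, with no cluster time. The composite trapezoidal rule here is only a stand-in for whatever numerical kernel your code actually implements:

```python
import math

def trapezoid(f, a, b, n):
    """Composite trapezoidal rule on n uniform intervals of [a, b]."""
    h = (b - a) / n
    s = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
    return h * s

def test_trapezoid_converges():
    # Exact integral of sin on [0, pi] is 2; the rule is second order,
    # so 1000 intervals should be accurate to a few times 1e-6.
    approx = trapezoid(math.sin, 0.0, math.pi, 1000)
    assert abs(approx - 2.0) < 1e-5

test_trapezoid_converges()
```

A suite of such tests runs on every commit in seconds and leaves the ten-hour cluster runs to confirm only what the unit tests cannot.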
It's great that you are trying to implement some kind of continuous integration, but here it does not seem a good fit. It is not possible to run the complete test suite before merging any changes, at least not without hurting productivity noticeably. You therefore need to consider whether the benefits of these slow tests outweigh their costs (here, possibly literal costs for time on the cluster).
You can then devise a strategy to reduce the costs. For example:
- run the full test suite less frequently, e.g. every night or before a release.
- use a smaller test suite for CI. The focus of this test suite is not demonstrating that it works, but just catching obvious problems early (a kind of smoke testing).
There are lots of possibilities to create a smaller, faster test suite:
- If the test suite consists of multiple problems, only select a subset for the CI tests, possibly at random.
- Choose smaller problem sizes. E.g. reduce the resolution or duration of simulations. Use sampled data sets.
- Focus on component tests rather than complete runs of the software.
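A reduced CI selection along these lines can be sketched as below. Since some verification cases may be non-negotiable, the sketch always includes a mandatory set and only samples the rest; the case names, budget, and seed are invented for illustration:

```python
import random

def select_cases(all_cases, mandatory, budget, seed):
    """Pick a reduced CI test set: every mandatory case, plus a
    reproducible random sample of the remaining cases, up to `budget`
    extra cases.  Seeding makes the selection stable for a given run."""
    optional = [c for c in all_cases if c not in mandatory]
    rng = random.Random(seed)
    extra = rng.sample(optional, min(budget, len(optional)))
    return sorted(set(mandatory) | set(extra))

cases = ["advection", "diffusion", "shock-tube", "vortex", "cavity"]
print(select_cases(cases, mandatory=["shock-tube"], budget=2, seed=42))
```

Rotating the seed per pull request spreads coverage across the optional cases over time, while the nightly or pre-release run still executes the full suite.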
To be clear: it is totally fine not to do any CI testing. That wouldn't be great, but it can be a valid decision if you're aware of the risks (primarily, the risk that the software gets broken without anyone noticing). But if the only way to make a commit is to wait hours or even days, that might be worse than no tests at all. As long as you still run the complete test suite regularly, you limit the risk that things break without notice. While you won't be alerted before the defect is merged, you'll still be alerted close to the cause of the problem.
Your technical difficulties configuring Jenkins are minor in this regard. Your HPC job submission mechanism may mean that while Jenkins is suitable for kicking off HPC jobs, it is not a suitable platform for aggregating test results or gating branch merges. That need not be a huge problem, e.g. if you give up the goal of CI-style gated merges and instead settle for nightly tests. Then it might be sufficient if the devs find the results in an email the next morning.
// But if the only way to make a commit is to wait hours or even days // I suppose individuals would be allowed to commit to their repository fork however they like, without going through verification, unless they specifically request a verification run. In other words, the verification testing should only become mandatory when a pull request is being reviewed. Verification testing and human code review should happen side-by-side.
– rwong
Dec 9 at 17:56
@rwong: that is basically what I am thinking about. I just want to make sure that a pull request that contains changes of numerical algorithms does not break any of the previous algorithm verification tests.
– tmaric
Dec 9 at 18:34
I cannot choose random tests; there are tests that simply must run, otherwise serious damage has occurred. I can probably reduce their running time, though, and hope that at higher input resolutions things are also fine, until a periodic build + test happens.
– tmaric
Dec 9 at 18:35
Continuous integration with tests that run for hours/days is not continuous integration.
Seriously, you've just taken us back to the days of nightly builds. Now sure, there is no LOGICAL reason a test has to complete in a timely manner. But the humans lose the benefits of continuous integration if you do this.
Continuous integration ensures that integration is performed by the same person who made these changes. It forces coders to look at how their stuff impacts other stuff before they break that stuff.
If I can turn stuff in and go on vacation and completely miss the fall out caused by my stuff, stop calling it continuous integration. I don't care what tools you used.
"If I can turn stuff in and go on vacation and completely miss the fall out caused by my stuff, stop calling it continuous integration." If a pull request in the main remote repo cannot be accepted before the full-scale tests show numerical convergence, and you know that the tests take at least a day, then you submit your pull request two weeks before going on holiday instead of on the morning of the last day, so that there is time to fix any problems. Either that, or your unmerged pull request is waiting for you when you get back.
– tmaric
Dec 9 at 18:45
3 Answers
I believe you have three issues here.
- Executing asynchronous tests
- Security Credentials
- Quality Gateways
Asynchronous Tests
Jenkins was designed for synchronous workloads, so while it is workable here, there is a definite mismatch to deal with.
So my first question is how do you know when the tests have completed for better or worse?
- If you have to poll a directory or a service for status and results, simply start that polling as the next step after job submission.
- If you receive an event, split the pipeline into two segments, and have the second segment triggered by that event. You may need to write a small program that maps the event sent from your cluster into a web request triggering the next stage in Jenkins.
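The polling option can be sketched as follows. This is a hypothetical helper, assuming a Slurm-style scheduler where `sacct` reports job state; the function names and flag choices are illustrative, not a drop-in for your cluster:

```python
import subprocess
import time

def slurm_state(job_id):
    """Last-known state of a Slurm job, queried via sacct."""
    out = subprocess.run(
        ["sacct", "-j", job_id, "--format=State", "--noheader", "-X"],
        capture_output=True, text=True, check=True,
    ).stdout.split()
    return out[0] if out else "UNKNOWN"

def wait_for_job(job_id, get_state=slurm_state, poll_interval=60,
                 timeout=12 * 3600):
    """Poll until the job reaches a terminal state, or give up."""
    terminal = {"COMPLETED", "FAILED", "CANCELLED", "TIMEOUT", "NODE_FAIL"}
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        state = get_state(job_id)
        if state in terminal:
            return state
        time.sleep(poll_interval)
    return "POLL_TIMEOUT"
```

The CI step then exits nonzero unless the returned state is `COMPLETED`, so the pipeline's result reflects the simulation outcome, not just the submission.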
Security Credentials
Jenkins can store the key securely using its internal credential management. This will require setting up a specific key for Jenkins. A dedicated Jenkins account on the compute cluster may even have a plus side: usage data can be collected for it, and limitations can be imposed on it.
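As an illustration, with the SSH Agent plugin a declarative pipeline can use a key held in Jenkins' credential store without the key ever appearing in the repository or the build log. The credential id, user, host, and script names below are hypothetical:

```groovy
pipeline {
    agent any
    stages {
        stage('Submit verification job') {
            steps {
                // 'hpc-cluster-key' is a private key stored in Jenkins'
                // credential manager for a dedicated CI account.
                sshagent(credentials: ['hpc-cluster-key']) {
                    sh 'ssh ci-user@hpc.example.org sbatch verification_case.sbatch'
                }
            }
        }
    }
}
```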
Quality Gateways
Continuous Integration pipelines are built around providing feedback as quickly as possible.
So when you say that testing takes ten hours, I wince, because I hear ten hours until any results are available. Taking ten hours to complete testing is fine; indeed, there are projects out there that take days to test completely. The point is that Continuous means a stream of information: it needs to be available in a timely manner, not ten hours later.
Think of the pipeline as raising a Quality meter. Each stage in the pipeline pushes that meter higher. You want to schedule your stages in such a way that you discover errors early, by essentially pushing the quality meter up as quickly as possible.
This maximises the likelihood that the developer is still thinking about the problem and can fix it quickly. It also maximises the total resource saving, by discovering that the build is bad earlier and not pursuing further verification. The pull request can also be rejected sooner, as any error is a show stopper.
Test Suites
So try to split your tests into short-running test suites. Anywhere between 15 minutes and 1 hour is a good length: much shorter and the overhead becomes burdensome, much longer and the results don't maintain that continuity of information flow. Try to schedule the shorter tests first (for the quick turnaround), but balance that against the total time to completion. It may make sense to operate a single queue of faster tests and parallelise the longer tests alongside it.
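One way to make such a schedule concrete is to tag each suite with an expected duration and let a small driver pick suites shortest-first within a time budget. The suite names and durations below are made up for illustration:

```python
# Hypothetical registry of verification suites with expected durations.
SUITES = {
    "smoke": {"cases": ["advection_1d_coarse"], "minutes": 10},
    "fast": {"cases": ["diffusion_2d", "advection_2d"], "minutes": 45},
    "full": {"cases": ["coupled_3d_fine"], "minutes": 600},
}

def schedule(suites, budget_minutes):
    """Pick suites shortest-first until the time budget is exhausted."""
    ordered = sorted(suites, key=lambda name: suites[name]["minutes"])
    picked, used = [], 0
    for name in ordered:
        if used + suites[name]["minutes"] <= budget_minutes:
            picked.append(name)
            used += suites[name]["minutes"]
    return picked
```

For a one-hour CI window this yields the smoke and fast suites, leaving the full suite to a nightly run.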
Smoke Tests
Smoke tests are another avenue for reducing the time to discover an error: reduce the size of the problem and run a single variation of an algorithm up front. Such a test runs faster than its heavier-weight cousins while still exercising much of the mechanics. It is not as complete as the heavier tests, but it will locate obvious issues sooner. Again, balance these carefully: they don't replace your current tests, they cover the important sections in the hope of identifying breakages earlier.
Reduce Platform Dependence
While you still need that final verification step on the compute cluster, most errors are detectable with unit tests. The more of the algorithm you can cover in unit tests, the quicker you can build confidence that it has not been broken. As a bonus, unit tests don't need the compute cluster in order to run.
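For example, the convergence order of a discretisation can be checked in seconds on a workstation, without any cluster time. A minimal sketch, using the composite trapezoidal rule as a stand-in for a real numerical kernel:

```python
import math

def trapezoid(f, a, b, n):
    """Composite trapezoidal rule on n uniform intervals."""
    h = (b - a) / n
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n)))

def test_trapezoid_is_second_order():
    # Halving h should reduce the error by roughly a factor of four
    # for a smooth integrand.
    exact = 2.0  # integral of sin(x) over [0, pi]
    e_coarse = abs(trapezoid(math.sin, 0.0, math.pi, 64) - exact)
    e_fine = abs(trapezoid(math.sin, 0.0, math.pi, 128) - exact)
    assert abs(e_coarse / e_fine - 4.0) < 0.2
```

A broken change to the kernel typically destroys the factor-of-four error reduction, so a test like this fails long before the full-scale verification run would.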
edited Dec 10 at 6:55
answered Dec 10 at 6:49
– Kain0_0
It's great that you are trying to implement some kind of continuous integration, but here it does not seem to be a good fit. It is not possible to run the complete test suite before merging any changes, at least not without noticeably hurting productivity. You therefore need to consider whether the benefits of these slow tests outweigh their costs (here, possibly literal costs for time on the cluster).
You can then devise a strategy to reduce the costs. For example:
- run the full test suite less frequently, e.g. every night or before a release.
- use a smaller test suite for CI. The focus of this test suite is not demonstrating that it works, but just catching obvious problems early (a kind of smoke testing).
There are lots of possibilities to create a smaller, faster test suite:
- If the test suite consists of multiple problems, only select a subset for the CI tests, possibly at random.
- Choose smaller problem sizes. E.g. reduce the resolution or duration of simulations. Use sampled data sets.
- Focus on component tests rather than complete runs of the software.
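For the subset option, the choice can be made deterministic per commit, so that a rerun reproduces the same selection while successive commits cover different cases over time. A hypothetical sketch (the function and case names are made up):

```python
import hashlib

def select_ci_subset(cases, commit, fraction=0.25):
    """Deterministically pick a fraction of the verification cases.

    Ranking each case by a hash of (commit, case) makes the choice
    reproducible for a given commit but different across commits.
    """
    def rank(case):
        return hashlib.sha256(f"{commit}:{case}".encode()).hexdigest()
    k = max(1, int(len(cases) * fraction))
    return sorted(cases, key=rank)[:k]
```

Any must-run cases would be appended unconditionally rather than left to the sampled subset.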
To be clear: it is totally fine not to do any CI testing. That wouldn't be great, but it can be a valid decision if you're aware of the risks (primarily, the risk that the software is broken without anyone noticing). And if the only way to make a commit is to wait hours or even days, that might be worse than no CI tests at all. As long as you still run the complete test suite regularly, you can contain the risk that things break without notice: while you won't be alerted before the defect is merged, you'll still be alerted close to the cause of the problem.
Your technical difficulties in configuring Jenkins are minor in this regard. Your HPC job submission mechanism may mean that while Jenkins is suitable for kicking off HPC jobs, it is not a suitable platform for aggregating test results or gating branch merges. That need not be a huge problem, e.g. if you give up the goal of CI-style gated merges and instead settle for nightly tests; then it might be sufficient for the devs to find the results in an email the next morning.
// But if the only way to make a commit is to wait hours or even days // I suppose individuals would be allowed to commit to their repository fork however they like, without going through verification, unless they specifically request a verification run. In other words, the verification testing should only become mandatory when a pull request is being reviewed. Verification testing and human code review should happen side-by-side.
– rwong
Dec 9 at 17:56
@rwong: that is basically what I am thinking about. I just want to make sure that a pull request that contains changes of numerical algorithms does not break any of the previous algorithm verification tests.
– tmaric
Dec 9 at 18:34
I cannot choose random tests; there are tests that simply must run, otherwise serious damage has occurred. I can probably reduce their running time, though, and hope that things are also fine at the higher input resolutions, until a periodic build + test happens.
– tmaric
Dec 9 at 18:35
answered Dec 9 at 17:46
– amon
Continuous integration with tests that run for hours/days is not continuous integration.
Seriously, you've just taken us back to the days of nightly builds. Sure, there is no LOGICAL reason a test has to complete in a timely manner, but the humans lose the benefits of continuous integration if you do this.
Continuous integration ensures that integration is performed by the same person who made these changes. It forces coders to look at how their stuff impacts other stuff before they break that stuff.
If I can turn stuff in and go on vacation and completely miss the fall out caused by my stuff, stop calling it continuous integration. I don't care what tools you used.
"If I can turn stuff in and go on vacation and completely miss the fall out caused by my stuff, stop calling it continuous integration." If a pull request in the main remote repo cannot be accepted before the full-scale tests show numerical convergence, and you know that the tests take at least a day, then you submit your pull request two weeks before going on holiday, instead of on the morning of the last day, so that there is time to fix the problems. Either that, or your unmerged pull request is waiting for you when you get back.
– tmaric
Dec 9 at 18:45
add a comment |
Continuous integration with tests that run for hours/days is not continuous integration.
Seriously you've just taken us back to the days of nightly builds. Now sure there is no LOGICAL reason a test has to complete in a timely manner. But the humans lose the benefits of continuous integration if you do this.
Continuous integration ensures that integration is performed by the same person who made these changes. It forces coders to look at how their stuff impacts other stuff before they break that stuff.
If I can turn stuff in and go on vacation and completely miss the fall out caused by my stuff, stop calling it continuous integration. I don't care what tools you used.
"If I can turn stuff in and go on vacation and completely miss the fall out caused by my stuff, stop calling it continuous integration." If a pull request in the main remote repo cannot be accepted before the full-scale tests show numerical convergence, and you know that the tests take at least a day, then you submit your pull request 2 weeks before going to holiday, instead of the morning of the last day, so that there is time to fix the problems. Either that, or your unmerged pull request is waiting for you when you get back.
– tmaric
Dec 9 at 18:45
add a comment |
Continuous integration with tests that run for hours/days is not continuous integration.
Seriously you've just taken us back to the days of nightly builds. Now sure there is no LOGICAL reason a test has to complete in a timely manner. But the humans lose the benefits of continuous integration if you do this.
Continuous integration ensures that integration is performed by the same person who made these changes. It forces coders to look at how their stuff impacts other stuff before they break that stuff.
If I can turn stuff in and go on vacation and completely miss the fall out caused by my stuff, stop calling it continuous integration. I don't care what tools you used.
Continuous integration with tests that run for hours/days is not continuous integration.
Seriously, you've just taken us back to the days of nightly builds. Sure, there is no LOGICAL reason a test has to complete in a timely manner, but the humans lose the benefits of continuous integration if you do this.
Continuous integration ensures that integration is performed by the same person who made the changes. It forces coders to look at how their stuff impacts other stuff before they break that stuff.
If I can turn stuff in and go on vacation and completely miss the fallout caused by my stuff, stop calling it continuous integration. I don't care what tools you used.
answered Dec 9 at 16:39
candied_orange
"If I can turn stuff in and go on vacation and completely miss the fall out caused by my stuff, stop calling it continuous integration." If a pull request in the main remote repo cannot be accepted before the full-scale tests show numerical convergence, and you know that the tests take at least a day, then you submit your pull request 2 weeks before going to holiday, instead of the morning of the last day, so that there is time to fix the problems. Either that, or your unmerged pull request is waiting for you when you get back.
– tmaric
Dec 9 at 18:45
If you run the verification tests manually, how are you informed that the tests have finished and what the results are? Would it be possible to have an alternative submission script that waits until the job has finished executing?
– Bart van Ingen Schenau
Dec 9 at 14:02
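The blocking submission script suggested in this comment can be sketched generically: a wrapper that polls the scheduler's reported job state until it reaches a terminal value, so that the CI step's exit status reflects the simulation's actual outcome rather than just the success of the submission. The state names and the `sacct` query mentioned below are assumptions modelled on Slurm; adapt them to your workload manager.

```python
import time

# Terminal job states as reported by a typical workload manager; the names
# below are modelled on Slurm's sacct output and are assumptions -- adapt
# them to your scheduler.
SUCCESS_STATES = {"COMPLETED"}
FAILURE_STATES = {"FAILED", "CANCELLED", "TIMEOUT", "NODE_FAIL"}

def wait_for_job(get_state, poll_interval=60.0, timeout=None):
    """Block until the job reaches a terminal state.

    get_state:  callable returning the scheduler's current state string for
                the job, e.g. by shelling out to `sacct -j <id> -o State -n`.
    Returns True on success, False on failure, and raises TimeoutError if
    `timeout` seconds elapse without reaching a terminal state.
    """
    waited = 0.0
    while True:
        state = get_state().strip()
        if state in SUCCESS_STATES:
            return True
        if state in FAILURE_STATES:
            return False
        if timeout is not None and waited >= timeout:
            raise TimeoutError(f"job still in state {state!r} after {waited}s")
        time.sleep(poll_interval)
        waited += poll_interval

# Simulated scheduler for illustration: PENDING -> RUNNING -> COMPLETED.
states = iter(["PENDING", "RUNNING", "COMPLETED"])
print(wait_for_job(lambda: next(states), poll_interval=0.0))  # True
```

With Slurm specifically, `sbatch --wait` gives similar blocking behaviour without hand-rolled polling.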
Occasionally in a former team of mine, we'd do things like test offline rendering in very complex scenes with all the gory bells and whistles (lots of indirect lighting with high bounces, very high samples, etc.). That sometimes took something like 10 hours to render. What we did in that case was dedicate a separate machine/script which would just run on its own independently, pick up the latest changes from version control, build, and render away once a day or so. It might miss some commits in between, but it would catch it if the render times or the correctness of the results were negatively affected.
– Dragon Energy
Dec 9 at 16:42
And that was a separate process from the usual kind of CI. It was just a simple dedicated machine/process which would, after ages spent rendering, periodically pick up the latest changes from version control. If the changeset differed from before, it'd go back to spending ages rendering again. In that case we actually verified the output manually, since it was just a once-a-day process or so, and with offline rendering of the kind we had there are all sorts of speed hacks which skew results a bit, but the idea of "correctness" is kind of tied to what "looks acceptable".
– Dragon Energy
Dec 9 at 16:46
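The dedicated-machine process described in these comments can be sketched as a simple poll loop. Here `get_revision` and `run_suite` are placeholder callables (not from the comment) standing in for something like a `git ls-remote` query and the hours-long build-and-render run:

```python
import time

def watch_and_test(get_revision, run_suite, poll_interval=3600.0, max_polls=None):
    """Dedicated-machine loop: rerun the long suite whenever the head changes.

    get_revision: callable returning the current head revision, e.g. via a
                  `git ls-remote origin refs/heads/master` query (assumed).
    run_suite:    callable taking the revision; may run for hours.
    max_polls:    optional bound, mainly useful for testing the loop itself.
    """
    last = None
    polls = 0
    while max_polls is None or polls < max_polls:
        rev = get_revision()
        if rev != last:       # changeset differs from the last one we tested
            run_suite(rev)    # the hours-long build-and-verify run
            last = rev
        polls += 1
        time.sleep(poll_interval)

# Simulated history for illustration: an unchanged revision is not retested.
tested = []
revs = iter(["a1", "a1", "b2"])
watch_and_test(lambda: next(revs), tested.append, poll_interval=0.0, max_polls=3)
print(tested)  # ['a1', 'b2']
```

Parameterising the revision query and the suite keeps the loop testable without a real repository or a day of rendering.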
Take a look at the OpenCV buildbot environment. The framework OpenCV chose to manage theirs is called, simply, BuildBot (URL: buildbot.net). The essence is that, for software where testing is long (as with all large-scale numerically intensive algorithms), compiling and testing are slightly asynchronous from version control. Gating is still possible thanks to the pull-request model. If sufficient machines are dedicated to testing, there is no need to omit some commits or pull requests.
– rwong
Dec 9 at 17:46
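A minimal BuildBot `master.cfg` along the lines of this comment might look as follows. The worker name, repository URL, and `verification.job` batch script are illustrative assumptions, not details from the question; the `sbatch --wait` step assumes a Slurm cluster where the blocking submission discussed above is available.

```python
# master.cfg -- minimal BuildBot sketch (names and URLs are placeholders).
from buildbot.plugins import changes, schedulers, steps, util, worker

c = BuildmasterConfig = {}

c["workers"] = [worker.Worker("hpc-worker", "secret-password")]
c["protocols"] = {"pb": {"port": 9989}}

# Poll the repository instead of coupling the Bitbucket server to the
# test machine via webhooks and SSH keys.
c["change_source"] = [changes.GitPoller(
    "https://bitbucket.org/yourteam/yourrepo.git",
    branches=["master"], pollInterval=300)]

# One build per batch of changes; long verification runs are simply queued.
c["schedulers"] = [schedulers.SingleBranchScheduler(
    name="full-verification",
    change_filter=util.ChangeFilter(branch="master"),
    builderNames=["verify"])]

factory = util.BuildFactory([
    steps.Git(repourl="https://bitbucket.org/yourteam/yourrepo.git"),
    # A blocking submission keeps the step's exit status tied to the
    # simulation's outcome; allow up to a day before giving up.
    steps.ShellCommand(command=["sbatch", "--wait", "verification.job"],
                       timeout=24 * 3600),
])
c["builders"] = [util.BuilderConfig(name="verify",
                                    workernames=["hpc-worker"],
                                    factory=factory)]
```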
However, there is a general rule-of-thumb for the upper limit for asynchronous build verification: around 15 minutes. If it is longer than that, consequences mentioned in candied_orange's answer creep in.
– rwong
Dec 9 at 17:49