How long should tasks run?

zombie67 [MM]
Joined: 11 Jun 18
Posts: 7
Credit: 111,259,937
RAC: 0
Message 8 - Posted: 15 Jun 2018, 14:10:04 UTC

I have some on my Mac that have been running over 20 hours now. What is "normal"?
Dublin, California
Team: SETI.USA

[AF>Le_Pommier] Jerome_C2005
Joined: 11 Jun 18
Posts: 26
Credit: 7,533,427
RAC: 2
Message 16 - Posted: 15 Jun 2018, 22:12:18 UTC
Last modified: 15 Jun 2018, 22:13:49 UTC

Very strange behavior on my Mac: when I came back home, I had quite a lot of apps open, like this one:



When I opened "About", it mentioned Java. I suspected the project, since I also had several of its tasks running in BOINC (for more than 12 hours; they seemed to have checkpointing implemented, but progress was at 0%)...

But because I don't like apps opening by themselves, and it seemed to me there were more of these windows than running tasks, I said "what the heck" and closed them all. The project tasks "crashed" just after that (i.e. most probably I terminated them this way), so it looks like the project needs this program running in the foreground. It seems to be only a display of the tasks' execution (there are no actions, options or buttons of any sort in it), but a required display.

See http://dhep.ga/boinc/results.php?hostid=69 (the state is "EXIT_ABORTED_BY_CLIENT").

I had one task remaining, so I let it run; we'll see tomorrow how it goes.

Michael DHEP
Joined: 5 Jun 18
Posts: 4
Credit: 0
RAC: 0
Message 19 - Posted: 16 Jun 2018, 0:21:43 UTC - in response to Message 16.  
Last modified: 16 Jun 2018, 1:02:47 UTC

Hello, just for the record: the GUI is not required for the client to run. You can safely close these windows (just close the windows, not any icon representing the Java process). Actually, we weren't expecting them to show up, as it was my understanding that a special flag/executable name was required to have GUIs within BOINC. We'll consider adding some kind of option to switch this on/off in the future, but will focus on getting the basics running for now.

If you want to know what that all means: http://dhep.ga/faq.php#gui

Michael DHEP
Joined: 5 Jun 18
Posts: 4
Credit: 0
RAC: 0
Message 20 - Posted: 16 Jun 2018, 0:28:33 UTC - in response to Message 8.  
Last modified: 16 Jun 2018, 0:29:03 UTC

A GA (genetic algorithm) has no end, so to speak. Just like we are all continuing to evolve right now.

Progress is sent directly to our servers and credit will be allocated via the trickle system. At the moment the BOINC wrapper is crashing out with its cheat detection (.95 percentile) and we're not entirely sure what is triggering this.

The tasks will in fact terminate when the goal of the GA is changed. For example, a couple of days ago we moved from tackling the m1 benchmark to misex1. Happily, evolution found a Totally Self-Checking version of m1 using only 40% of the resources required by conventional design techniques.

[AF>Le_Pommier] Jerome_C2005
Joined: 11 Jun 18
Posts: 26
Credit: 7,533,427
RAC: 2
Message 26 - Posted: 16 Jun 2018, 10:36:43 UTC

Not sure I understood all of your explanation, but I'm happy to see you coming here to give it, and even more that you are working on development and support!

I finally had 3 tasks terminate. I know this thanks to BoincTasks, which tracks task history, because it seems you purge the task history very quickly and mine is already empty (this is not a good thing).

Therefore I lost track of my first tasks that crashed (especially the log that could have helped you). I thought they had crashed because I had closed those windows (it happened just after that). Also note that the windows were not called "java", but the "About" feature of the application on the Mac did mention "java", and you say "You can safely close these (just close the windows, not any icon representing the Java process)", so this is confusing.

mmonnin
Joined: 10 Jun 18
Posts: 39
Credit: 15,776,229
RAC: 0
Message 29 - Posted: 16 Jun 2018, 11:56:54 UTC
Last modified: 16 Jun 2018, 12:01:29 UTC

Hmm, if they purge quickly, I guess that's why I can't find the one that completed in 1:14 (h:mm). But for that same PC it shows 90 tasks in progress that I do not have on my PC:
http://dhep.ga/boinc/results.php?hostid=126

The BOINC log does not show more than the single task being downloaded.

zombie67 [MM]
Joined: 11 Jun 18
Posts: 7
Credit: 111,259,937
RAC: 0
Message 34 - Posted: 19 Jun 2018, 15:35:39 UTC

Did we ever get an answer to the original question? How long should the tasks run normally? A day? A month? A year? I am just trying to understand when things are normal, vs when there is a problem.
Dublin, California
Team: SETI.USA

Michael DHEP
Project administrator
Project developer
Project scientist
Joined: 13 Jun 18
Posts: 271
Credit: 0
RAC: 0
Message 39 - Posted: 20 Jun 2018, 14:18:00 UTC - in response to Message 34.  

Hi Zombie,

This is undefined, but roughly between two and four weeks. If you want a more precise answer: the task will run as long as the goal being evolved (http://www.dhep.ga/statsgoal.php) remains the same. When the goal changes, the task will end and the next task will retrieve the new goal from the server.
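
To illustrate that goal-driven lifetime, here is a minimal Python sketch. It is not the project's client code (which is Java): `run_task`, `fetch_goal` and `evolve_one_generation` are hypothetical names, and polling the goal once per generation is an assumption made only for this example.

```python
from typing import Callable


def run_task(fetch_goal: Callable[[], str],
             evolve_one_generation: Callable[[str], None]) -> int:
    """Evolve until the server-side goal changes; return generations run."""
    goal = fetch_goal()          # the goal this task was started for
    generations = 0
    # Assumption: the goal is polled before every generation.
    while fetch_goal() == goal:
        evolve_one_generation(goal)
        generations += 1
    return generations           # the next task would fetch the new goal


# Usage with a scripted "server": the goal switches from m1 to misex1.
answers = iter(["m1", "m1", "m1", "misex1"])
worked_on = []
count = run_task(lambda: next(answers), worked_on.append)
print(count, worked_on)
```

The task has no fixed length here: it simply stops at the first poll that returns a different goal, which is why a run time cannot be predicted in advance.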

Hope that's clearer!

[AF>Le_Pommier] Jerome_C2005
Joined: 11 Jun 18
Posts: 26
Credit: 7,533,427
RAC: 2
Message 43 - Posted: 20 Jun 2018, 17:09:53 UTC - in response to Message 39.  
Last modified: 20 Jun 2018, 17:10:14 UTC

Hello Michael,

I have 4 tasks idle on the machine (no CPU used) because

Trying to get task from server //rmi.dhep.ga/dubhnahUidhe
java.rmi.ConnectException: Connection refused to host: rmi.dhep.ga; nested exception is:
java.net.ConnectException: Connection timed out: connect
No worries, server is probably being restarted.

(all the time, on a PC)

Thanks.

PDW
Project administrator
Joined: 12 Jun 18
Posts: 47
Credit: 163,365,204
RAC: 3
Message 44 - Posted: 20 Jun 2018, 17:24:53 UTC - in response to Message 43.  

> Hello Michael,
>
> I have 4 tasks idle on the machine (no CPU used) because
>
> Trying to get task from server //rmi.dhep.ga/dubhnahUidhe
> java.rmi.ConnectException: Connection refused to host: rmi.dhep.ga; nested exception is:
> java.net.ConnectException: Connection timed out: connect
> No worries, server is probably being restarted.
>
> (all the time, on a PC)
>
> Thanks.

My Windows machine is happily connecting and running a task. Have you tried removing and re-adding the project?

My Linux machine is still producing the errors reported here: http://dhep.ga/boinc/forum_thread.php?id=7
I tried removing/re-adding again but they fail after a few seconds.

zombie67 [MM]
Joined: 11 Jun 18
Posts: 7
Credit: 111,259,937
RAC: 0
Message 45 - Posted: 20 Jun 2018, 20:13:29 UTC - in response to Message 39.  

> Hi Zombie,
>
> This is undefined, but roughly between two and four weeks. If you want a more precise answer: the task will run as long as the goal being evolved (http://www.dhep.ga/statsgoal.php) remains the same. When the goal changes, the task will end and the next task will retrieve the new goal from the server.
>
> Hope that's clearer!


Thanks! That really helps.

Are there any issues if we have to quit BOINC, or restart our machines? Will the work be lost? Or will it just pick up where it left off?
Dublin, California
Team: SETI.USA

Michael DHEP
Project administrator
Project developer
Project scientist
Joined: 13 Jun 18
Posts: 271
Credit: 0
RAC: 0
Message 47 - Posted: 20 Jun 2018, 20:40:37 UTC - in response to Message 45.  
Last modified: 20 Jun 2018, 20:40:59 UTC

Hi Zombie, a genetic algorithm works best if islands are stable over a long period of time. However, your island's population is saved on the server every 15 minutes and will be resumed when you reconnect. Similarly, progress related to credits (how many circuits were evaluated) is reported every 15 minutes.

If for any reason you didn't have internet for a whole day, this isn't a big issue either, as progress will be reported (and you will be credited for all work done) once a connection is re-established.
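
The behaviour described above (report every 15 minutes, keep the backlog while offline, send everything once the connection is back) can be sketched in a few lines of Python. This is only an illustration under assumptions, not the DHEP client: `ProgressReporter`, its methods and the `send` callback are hypothetical names, and time is passed in explicitly so the example runs without waiting.

```python
from typing import Callable

REPORT_INTERVAL_S = 15 * 60  # the 15-minute interval mentioned above


class ProgressReporter:
    """Buffers evaluated-circuit counts until the server accepts them."""

    def __init__(self, send: Callable[[int], None]) -> None:
        self._send = send          # hypothetical upload; raises ConnectionError when offline
        self._pending = 0          # evaluations not yet accepted by the server
        self._last_attempt = 0.0   # time of the last report attempt, in seconds

    def record(self, evaluations: int) -> None:
        self._pending += evaluations

    def tick(self, now: float) -> bool:
        """Try to report if the interval has elapsed; return True on success."""
        if now - self._last_attempt < REPORT_INTERVAL_S:
            return False
        self._last_attempt = now
        if self._pending == 0:
            return False
        try:
            self._send(self._pending)
        except ConnectionError:
            return False           # keep the backlog for a later attempt
        self._pending = 0
        return True
```

Because the backlog is only cleared after a successful send, work done during an outage is reported in one go at the first successful attempt afterwards.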

[AF>Le_Pommier] Jerome_C2005
Joined: 11 Jun 18
Posts: 26
Credit: 7,533,427
RAC: 2
Message 64 - Posted: 24 Jun 2018, 9:09:05 UTC

Hi Michael

Your explanations are clear (enough for me ;) ). However, I foresee an issue with a BOINC project whose tasks can last "from a few hours to several weeks": this is called "cohabitation" (with other projects).

The regular cruncher usually has several projects running on the same machine(s). They do not necessarily run at the same time for those who "micro-manage" their BOINC activity (taking part in their team's actions on one project for a time, pursuing their own personal goals for another time, etc.), but most frequently they do for those who use BOINC more lazily, attaching to a list of projects and letting "things go on their own" (which is one of the main objectives of BOINC for regular people). Being unable to predict a fairly accurate running time might be an issue for other projects' tasks, which may end up lost (aborted) or finishing overdue (and possibly losing their credit, depending on the project).

Besides, you provide a deadline for your tasks, something like 1 or 2 weeks, so what do you plan to do: extend the deadline automatically? Will that be visible in the BOINC Manager? Some projects already do this and crunchers really don't like it: it forces the task into "high priority" running (to the detriment of other projects' tasks) and you are never sure whether you will actually lose the credit.

STE\/E
Joined: 10 Jun 18
Posts: 5
Credit: 59,848,402
RAC: 0
Message 65 - Posted: 25 Jun 2018, 7:35:24 UTC - in response to Message 47.  
Last modified: 25 Jun 2018, 7:38:35 UTC

> Hi Zombie, a genetic algorithm works best if islands are stable over a long period of time. However, your island's population is saved on the server every 15 minutes and will be resumed when you reconnect. Similarly, progress related to credits (how many circuits were evaluated) is reported every 15 minutes.
>
> If for any reason you didn't have internet for a whole day, this isn't a big issue either, as progress will be reported (and you will be credited for all work done) once a connection is re-established.

I had to shut one of my boxes off and turn it back on again. When I turned it back on, all 8 of the running DHEP WUs started from 0% progress again... Of course that doesn't mean much, since the WUs that have been running on other boxes for more than 5 days show 0% progress too...

Michael DHEP
Project administrator
Project developer
Project scientist
Joined: 13 Jun 18
Posts: 271
Credit: 0
RAC: 0
Message 68 - Posted: 25 Jun 2018, 14:30:59 UTC - in response to Message 65.  

Hello, here is more on the current plan for credits and tasks.


    o Tasks are run at the standard BOINC priority so cohabitation in that sense is fine
    o Under the current scheme, BOINC tasks will neither complete nor show progress. Your progress is logged by the DHE servers every 15 minutes and credits are allocated accordingly. You are free to jump in and out of a DHE task as you wish, though the GA is more effective with stable islands.
    o Again, credits are not related to tasks, they are generated by the DHE server as soon as your client connects every 15 minutes. So these leader boards: http://www.dhep.ga/statsrankings.php and http://www.dhep.ga/statsdailyrankings.php and now also https://stats.free-dc.org/proj/dhe will show your credits. Also you can see what your island / task / BOINC client is up to here http://www.dhep.ga/statstopology.php and to some degree here http://www.dhep.ga/statsstrains.php .
    o To make it even clearer, credits are allocated throughout task activity without requiring task completion.


[AF>Le_Pommier] Jerome_C2005
Joined: 11 Jun 18
Posts: 26
Credit: 7,533,427
RAC: 2
Message 71 - Posted: 26 Jun 2018, 20:55:21 UTC

Clearer, indeed.

Does this apply to any client / any OS?

Also, this is me: http://www.dhep.ga/statsisland.php?island_id=6081 (there is a "total effort": what does it mean?)
Does this correspond to any BOINC credit somewhere?

Michael DHEP
Project administrator
Project developer
Project scientist
Joined: 13 Jun 18
Posts: 271
Credit: 0
RAC: 0
Message 72 - Posted: 26 Jun 2018, 22:16:36 UTC - in response to Message 71.  
Last modified: 26 Jun 2018, 22:17:20 UTC

> Does this apply to any client / any OS?

Yes (as long as you've reset your client in the last couple of days).

> (there is a "total effort": what does it mean?)
> Does this correspond to any BOINC credit somewhere?

Correct. This metric is scaled to BOINC credit so that it ends up being the usual 20 per hour/core.

Total effort itself is calculated as circuit evaluations * circuit evaluation cost, where the evaluation cost is determined by the size of the circuit and its input count.
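
As a rough worked example of that calculation, here is a small Python sketch. Only the product "evaluations * evaluation cost" and the target of 20 credits per core-hour come from the explanation above. The cost model (circuit size times 2 to the power of the input count) and the reference effort per core-hour are assumptions invented for illustration; the real formulas are not given in this thread.

```python
CREDITS_PER_CORE_HOUR = 20.0  # "the usual 20 per hour/core"


def evaluation_cost(circuit_size: int, input_count: int) -> int:
    """Hypothetical cost model: each element is simulated once per input vector."""
    return circuit_size * (2 ** input_count)


def total_effort(evaluations: int, circuit_size: int, input_count: int) -> int:
    """Total effort = circuit evaluations * circuit evaluation cost."""
    return evaluations * evaluation_cost(circuit_size, input_count)


def effort_to_credit(effort: float, reference_effort_per_core_hour: float) -> float:
    """Scale effort so that one reference core-hour is worth 20 credits."""
    return CREDITS_PER_CORE_HOUR * effort / reference_effort_per_core_hour


# Usage: 1000 evaluations of a 50-element circuit with 8 inputs,
# assuming a core manages 6,400,000 effort units per hour.
effort = total_effort(1000, circuit_size=50, input_count=8)
credit = effort_to_credit(effort, reference_effort_per_core_hour=6_400_000)
print(effort, credit)
```

With these made-up numbers the island did two reference core-hours of work, so it would be granted 40 credits regardless of whether any BOINC task ever completes.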

Conan
Joined: 25 Jun 18
Posts: 24
Credit: 19,731,151
RAC: 1
Message 77 - Posted: 27 Jun 2018, 0:46:21 UTC - in response to Message 65.  
Last modified: 27 Jun 2018, 0:49:42 UTC

> > Hi Zombie, a genetic algorithm works best if islands are stable over a long period of time. However, your island's population is saved on the server every 15 minutes and will be resumed when you reconnect. Similarly, progress related to credits (how many circuits were evaluated) is reported every 15 minutes.
> >
> > If for any reason you didn't have internet for a whole day, this isn't a big issue either, as progress will be reported (and you will be credited for all work done) once a connection is re-established.
>
> I had to shut one of my boxes off and turn it back on again. When I turned it back on, all 8 of the running DHEP WUs started from 0% progress again... Of course that doesn't mean much, since the WUs that have been running on other boxes for more than 5 days show 0% progress too...


When you restart the BOINC client for any reason, the hours reset to zero, so I hope all the data gets saved back to the project servers, as we appear to lose it on our computers.
It would be nice (if it is practical) to see some progress made on a work unit, but I suppose that if it is unknown what is going to happen, I don't know how you would work out the progress.

I doubt Gridcoin people will bother with this project, as they get paid (so I believe) based on their RAC, and we have none on this project.
The credit just appears on your total, not on a per-host basis, so you can't work out where the credit came from, not that it is a big issue.

Conan

Michael DHEP
Project administrator
Project developer
Project scientist
Joined: 13 Jun 18
Posts: 271
Credit: 0
RAC: 0
Message 79 - Posted: 27 Jun 2018, 4:26:20 UTC - in response to Message 77.  
Last modified: 27 Jun 2018, 4:26:40 UTC


> When you restart the BOINC client for any reason, the hours reset to zero, so I hope all the data gets saved back to the project servers, as we appear to lose it on our computers.

All data is saved at the servers every 15 minutes.


> It would be nice (if it is practical) to see some progress made on a work unit, but I suppose that if it is unknown what is going to happen, I don't know how you would work out the progress.

Exactly. If you're interested you can monitor progress of the science side of the project at http://www.dhep.ga/statsgoal.php


> I doubt Gridcoin people will bother with this project, as they get paid (so I believe) based on their RAC, and we have none on this project.
> The credit just appears on your total, not on a per-host basis, so you can't work out where the credit came from, not that it is a big issue.

We'll keep working on these and see what we can do. Thanks for the heads-up on this. For now we're hoping we're at the stage where:
1. BOINC people can join DHE on Linux/Mac/Windows
2. You receive credits fairly

Theadalus
Joined: 10 Mar 19
Posts: 6
Credit: 247,248,885
RAC: 6
Message 1019 - Posted: 23 Jun 2019, 23:27:15 UTC

*Bump*

Hi,

Normally, if I want to stop a machine from participating in a certain project (to move to a different project), I set the BOINC client status to "Won't get new tasks" and run the buffers empty in a "proper" way.

As far as I understand (correct me if I'm wrong), each WU stands for calculations/simulations on a particular "island", and the runtime can be indefinite. So running the buffers empty can take months?
As calculations are saved to the server every hour, what is the effect when I abort a WU? Will it affect the outcome on an island? Will the calculations be picked up/resumed by another client, or become useless?

What is a proper way to (temporarily) stop a machine, besides 'NOT'? ;-)

I also noticed that when WUs have been running for a longer period, the CPU load decreases. For example:
I have a host (ID: 17916) which is running 16 WUs: 5 WUs have been running for 4d:11h with CPU load = 99.9%, and 11 WUs have been running for 18d:14h with CPU load = 77.2%, although in the beginning these 11 WUs also started at 99.9% CPU load.
This is consistent across all my machines.


©2019