Some linux machines communicate ok others not so.

root

Joined: 11 Aug 10
Posts: 10
Credit: 840,315
RAC: 0
Message 1765 - Posted: 13 Oct 2010, 14:21:31 UTC

I have found that some of my Linux machines connect to Enigma fine and get w/u's. Others cannot even connect to get the first w/u.
They get:
Wed 13 Oct 2010 03:03:50 PM BST Enigma@Home Sending scheduler request: Project initialization.
Wed 13 Oct 2010 03:03:50 PM BST Enigma@Home Requesting new tasks
Wed 13 Oct 2010 03:03:53 PM BST Project communication failed: attempting access to reference site
Wed 13 Oct 2010 03:03:54 PM BST Internet access OK - project servers may be temporarily down.
Wed 13 Oct 2010 03:03:55 PM BST Enigma@Home Scheduler request failed: Server returned nothing (no headers, no data)


Now the good machines get:-
Wed 13 Oct 2010 02:49:06 PM BST Enigma@Home Sending scheduler request: To fetch work.
Wed 13 Oct 2010 02:49:06 PM BST Enigma@Home Requesting new tasks
Wed 13 Oct 2010 02:49:10 PM BST Project communication failed: attempting access to reference site
Wed 13 Oct 2010 02:49:11 PM BST Enigma@Home Scheduler request failed: Server returned nothing (no headers, no data)
Wed 13 Oct 2010 02:49:11 PM BST Enigma@Home Sending scheduler request: To fetch work.
Wed 13 Oct 2010 02:49:11 PM BST Enigma@Home Requesting new tasks
Wed 13 Oct 2010 02:49:12 PM BST Internet access OK - project servers may be temporarily down.
Wed 13 Oct 2010 02:49:16 PM BST Enigma@Home Scheduler request completed: got 35 new tasks

Almost the same except for the final line: they get work.
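To compare logs like the two above without reading them line by line, here is a minimal Python sketch. The function name `summarize` and the two patterns are assumptions based only on the message lines quoted in this post, not on any BOINC API.

```python
import re

# Patterns taken from the outcome lines quoted in the logs above.
FAILED = re.compile(r"Scheduler request failed: (?P<reason>.+)$")
COMPLETED = re.compile(r"Scheduler request completed: got (?P<count>\d+) new tasks")


def summarize(log_text):
    """Return (failed_requests, tasks_received) for a block of client messages."""
    failed = 0
    tasks = 0
    for line in log_text.splitlines():
        if FAILED.search(line):
            failed += 1
        match = COMPLETED.search(line)
        if match:
            tasks += int(match.group("count"))
    return failed, tasks
```

Fed the log from a good machine, this should report one failed request followed by 35 tasks; fed the log from a bad machine, one failed request and no tasks.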

The Windows machine is fine.
Anybody else get this? Which machines it works on seems random.

Nairb
root

Joined: 11 Aug 10
Posts: 10
Credit: 840,315
RAC: 0
Message 1803 - Posted: 2 Nov 2010, 20:55:17 UTC

One of my linux machines has never had any work. On the computer details page it is shown as (Computer 36554) and has 64 outstanding tasks. The page shows that "Number of times BOINC has contacted server" is zero but that the "Last time contacted server" was "2 Nov 2010 11:39:30 UTC".
This machine has never had a single w/u and does not seem to make a successful connection to the server.

Is this a server bug too?

Ta
Nairb
TJM
Project administrator
Project developer
Project scientist

Joined: 25 Aug 07
Posts: 843
Credit: 70,812,672
RAC: 391,498
Message 1805 - Posted: 2 Nov 2010, 23:08:45 UTC

Maybe, I'll check that after I fix several other issues.
M4 Project homepage
M4 Project wiki
root

Joined: 11 Aug 10
Posts: 10
Credit: 840,315
RAC: 0
Message 1810 - Posted: 4 Nov 2010, 17:29:40 UTC

Since yesterday computer 33159 has not been able to send back 33 results (due on the 3rd Nov).
The account details for this machine also say that I have 47 jobs live and due on the 5th Nov. In fact the machine has none left to do.

The following is all that I get when trying to communicate with the server.

Thu 04 Nov 2010 05:08:56 PM GMT Enigma@Home update requested by user
Thu 04 Nov 2010 05:08:58 PM GMT Enigma@Home Sending scheduler request: Requested by user.
Thu 04 Nov 2010 05:08:58 PM GMT Enigma@Home Reporting 33 completed tasks, not requesting new tasks
Thu 04 Nov 2010 05:09:02 PM GMT Project communication failed: attempting access to reference site
Thu 04 Nov 2010 05:09:03 PM GMT Enigma@Home Scheduler request failed: Server returned nothing (no headers, no data)
Thu 04 Nov 2010 05:09:04 PM GMT Internet access OK - project servers may be temporarily down.


I'm not able to get new work either.

Ta
root

Joined: 11 Aug 10
Posts: 10
Credit: 840,315
RAC: 0
Message 1818 - Posted: 10 Nov 2010, 20:20:18 UTC

None of my machines that use a proxy server to get to the outside world can now communicate with Enigma. The only machine that has a direct connection is fine. Other projects on these machines work fine. It looks like there is a problem with Enigma communication when using a proxy server.

Nairb
Graeme of Boinc UK

Joined: 11 Oct 07
Posts: 29
Credit: 12,503,305
RAC: 0
Message 1821 - Posted: 10 Nov 2010, 23:45:49 UTC

This is not exclusive to Linux: I have three machines that cannot upload due to a "transient upload error", and two of them are Windows machines.
I have a 5 day logjam of work to upload.
TJM
Project administrator
Project developer
Project scientist

Joined: 25 Aug 07
Posts: 843
Credit: 70,812,672
RAC: 391,498
Message 1822 - Posted: 10 Nov 2010, 23:46:07 UTC
Last modified: 10 Nov 2010, 23:49:10 UTC

It's not a problem with the proxy, but the fact that the domain name points to nowhere.
The hosting account I used to park the domain and as a download mirror used up all its resources and is disabled right now.
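A quick way to see whether a name resolves at all from a given machine is sketched below in Python. The helper name `resolves` and the injectable `resolver` parameter are assumptions made for illustration; only `socket.getaddrinfo` and `socket.gaierror` are standard library names.

```python
import socket


def resolves(hostname, resolver=socket.getaddrinfo):
    """Return True if hostname resolves to at least one address."""
    try:
        # Port 80 is only used to satisfy the lookup call; no connection is made.
        return len(resolver(hostname, 80)) > 0
    except socket.gaierror:
        # Raised by getaddrinfo when the name cannot be resolved.
        return False
```

This only tests the lookup on the machine it runs on. A name that resolves to a wrong or parked address still returns True, and a machine behind an HTTP proxy usually leaves the lookup to the proxy, so results there may differ.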
Graeme of Boinc UK

Joined: 11 Oct 07
Posts: 29
Credit: 12,503,305
RAC: 0
Message 1823 - Posted: 11 Nov 2010, 0:24:00 UTC - in response to Message 1822.  

In the meantime I have workunits, lots of them, that are now becoming time expired.
Do I shut down these machines until this problem is resolved or continue on the understanding that the expiry date will be extended to allow for this problem?
I am not the only one having this problem, TJM.
root

Joined: 11 Aug 10
Posts: 10
Credit: 840,315
RAC: 0
Message 1824 - Posted: 11 Nov 2010, 0:31:40 UTC

OK, I just started up a Linux machine that uses a proxy server and attached it to Enigma. The connection failed with the same messages as already posted in this thread. I then connected my serial modem, removed the proxy option from BOINC and tried a direct connection. I now get 175 w/u fine and they are downloading OK (a bit slow).
I don't understand the proxy issue, but it works fine for direct connections.
My Windows machine with a direct connection is able to send/get data fine.

Nairb
TJM
Project administrator
Project developer
Project scientist

Joined: 25 Aug 07
Posts: 843
Credit: 70,812,672
RAC: 391,498
Message 1825 - Posted: 11 Nov 2010, 9:10:22 UTC

Adding these two lines to the hosts file fixes the problem:
69.12.216.209 enigmaathome.net
69.12.216.209 www.enigmaathome.net
When the domain will go back online I don't know; not today, that's for sure.
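For anyone applying this on several machines, here is a minimal Python sketch that adds the entries to the text of a hosts file without duplicating names already listed. The function name `add_hosts_entries` is hypothetical. The hosts file is normally /etc/hosts on Linux and C:\Windows\System32\drivers\etc\hosts on Windows, and editing it needs administrator rights, so the sketch works on the file contents as a string and leaves reading and writing to you.

```python
def add_hosts_entries(hosts_text, ip, names):
    """Return hosts_text with one 'ip name' line appended per missing name."""
    present = set()
    for line in hosts_text.splitlines():
        # Drop comments, then split into the address and the names mapped to it.
        fields = line.split("#", 1)[0].split()
        present.update(fields[1:])
    lines = hosts_text.splitlines()
    for name in names:
        if name not in present:
            lines.append(f"{ip} {name}")
    return "\n".join(lines) + "\n"


# The address and names are the ones given in the post above.
updated = add_hosts_entries(
    "127.0.0.1 localhost\n",
    "69.12.216.209",
    ["enigmaathome.net", "www.enigmaathome.net"],
)
```

Running it a second time on its own output changes nothing. The entries are a temporary workaround and should be removed once the domain is back online, otherwise the machine keeps using this address even if the server moves.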

Graeme of Boinc UK

Joined: 11 Oct 07
Posts: 29
Credit: 12,503,305
RAC: 0
Message 1829 - Posted: 12 Nov 2010, 23:15:50 UTC

Thanks TJM, it worked a treat.

Shame that a large number of workunits are "Completed, marked as invalid".
Should this happen again then I will abort those workunits close to timeout as my electricity does not come cheap.

Regards,
Graeme.
root

Joined: 11 Aug 10
Posts: 10
Credit: 840,315
RAC: 0
Message 1832 - Posted: 13 Nov 2010, 19:22:18 UTC

Yes, it's worked for me too.

Ta

Nairb
TJM
Project administrator
Project developer
Project scientist

Joined: 25 Aug 07
Posts: 843
Credit: 70,812,672
RAC: 391,498
Message 1833 - Posted: 13 Nov 2010, 19:30:27 UTC - in response to Message 1829.  


Could you post a link to the invalid workunits on the workunit/result page?
The server logs all invalid WUs and I saw maybe 10 of them during the last 24 hours.


Graeme of Boinc UK

Joined: 11 Oct 07
Posts: 29
Credit: 12,503,305
RAC: 0
Message 1834 - Posted: 13 Nov 2010, 21:19:46 UTC - in response to Message 1833.  

TJM
Project administrator
Project developer
Project scientist

Joined: 25 Aug 07
Posts: 843
Credit: 70,812,672
RAC: 391,498
Message 1842 - Posted: 17 Nov 2010, 1:14:11 UTC

There are only 2 invalid results due to output file corruption; the rest with validate errors were marked invalid by broken daemons long after they were first validated, and the granted_credit field clearly shows a non-zero value.
If a result is marked invalid and *it is* invalid, there's usually an explanation in stderr_out added by the validator.





Copyright © 2017 TJM