Rubyforge.org down

RubyForge is down... investigating now.

It's back up now. We may be having hardware issues - the machine seems
to be just halting without writing anything to the system log. I'm
somewhat flummoxed.

Yours,

Tom

My high school programming teacher had a saying for this... "Man I hate computers." :wink:

James Edward Gray II

···

On Oct 10, 2006, at 3:52 PM, Tom Copeland wrote:

RubyForge is down... investigating now.

It's back up now. We may be having hardware issues - the machine seems
to be just halting without writing anything to the system log. I'm
somewhat flummoxed.

full partition?

-a

···

On Wed, 11 Oct 2006, Tom Copeland wrote:

RubyForge is down... investigating now.

It's back up now. We may be having hardware issues - the machine seems
to be just halting without writing anything to the system log. I'm
somewhat flummoxed.

--
my religion is very simple. my religion is kindness. -- the dalai lama

Tom Copeland wrote:

It's back up now. We may be having hardware issues - the machine seems
to be just halting without writing anything to the system log. I'm
somewhat flummoxed.

Check into OS-hardware incompatibilities. My motherboard causes the same problem with Linux due to some weird-ass chipset timing issue (until you `athcool off` or whatever). Granted, I doubt you're running on a 4 year old nForce 2, but, you know...... linux sucks *runs*

*runs back* Seriously, though, Google. *runs*

> It's back up now. We may be having hardware issues - the machine seems
> to be just halting without writing anything to the system log. I'm
> somewhat flummoxed.

My high school programming teacher had a saying for this... "Man I
hate computers." :wink:

Yup, it's getting to the point where suggestions like "let's replace all
the RAM chips" are starting to sound reasonable. Although I hate to
pester RubyCentral for more hardware money...

Yours,

Tom

Nah, it's just got two partitions - /boot and /. / has about 600 GB
free, so no problem there :slight_smile:

Yours,

Tom

···

On Wed, 2006-10-11 at 06:51 +0900, ara.t.howard@noaa.gov wrote:

On Wed, 11 Oct 2006, Tom Copeland wrote:

>> RubyForge is down... investigating now.
>
> It's back up now. We may be having hardware issues - the machine seems
> to be just halting without writing anything to the system log. I'm
> somewhat flummoxed.

full partition?

What kind of a server do you have in mind? I may be able to help. -Tim

···

On Oct 10, 2006, at 2:08 PM, Tom Copeland wrote:

Yup, it's getting to the point where suggestions like "let's replace all
the RAM chips" are starting to sound reasonable. Although I hate to
pester RubyCentral for more hardware money...

Yup, it's getting to the point where suggestions like "let's replace all
the RAM chips" are starting to sound reasonable. Although I hate to
pester RubyCentral for more hardware money...

I had RAM go bad on one of our home PCs last year. It does
take a while to get to that "let's replace all the RAM chips"
threshold, doesn't it. I forget all the troubleshooting steps
I'd taken before arriving there, but they were many. The
problem manifested itself in such weird ways. It got to the
point where I wanted to download and re-apply the Microsoft
service packs (this was a win2k box), and when I'd download
the 30MB file from the networked PC upstairs, it wouldn't
extract properly, claiming it was corrupted. I forget all
the diagnostics I tried (other symptoms were random crashes
of applications or the whole OS)... eventually after
replacing IDE cables and I think even the boot/system hard
drive, I realized, jesus it might well be the RAM. Turned
out it was! The last time I'd had RAM go bad was about 14
years earlier on an Amiga development system. :slight_smile:

Good luck!

If your motherboard is flexible with RAM configuration and
you have multiple DIMMs, maybe you'll be able to try
swapping out some DIMMs and hopefully end up isolating the
bad one. (That's what I did, but the system only had
two DIMMs, so it was easy.)

Regards,

Bill

···

From: "Tom Copeland" <tom@infoether.com>

Yup, it's getting to the point where suggestions like "let's replace all
the RAM chips" are starting to sound reasonable. Although I hate to
pester RubyCentral for more hardware money...

Is the filesystem ReiserFS?

Martin

Oh, the current machine can handle the load just fine; I'm just
wondering if flaky RAM may be causing the problems.

Yours,

Tom

···

On Wed, 2006-10-11 at 06:21 +0900, Tim Bray wrote:

On Oct 10, 2006, at 2:08 PM, Tom Copeland wrote:

> Yup, it's getting to the point where suggestions like "let's replace all
> the RAM chips" are starting to sound reasonable. Although I hate to
> pester RubyCentral for more hardware money...

What kind of a server do you have in mind? I may be able to help. -Tim

Bill Kelly wrote:

From: "Tom Copeland" <tom@infoether.com>

Yup, it's getting to the point where suggestions like "let's replace all
the RAM chips" are starting to sound reasonable. Although I hate to
pester RubyCentral for more hardware money...

I had RAM go bad on one of our home PC's last year. It does
take awhile to get to that "let's replace all the RAM chips"
threshold, doesn't it. I forget all the troubleshooting steps
I'd taken before arriving there, but they were many. The problem
manifested itself in such weird ways. It got to the
point where I wanted to download and re-apply the microsoft
service packs (this was a win2k box), and when I'd download
the 30MB file from the networked PC upstairs, it wouldn't
extract properly, claiming it was corrupted. I forget all
the diagnostics I tried (other symptoms were random crashes
of applications or the whole OS)... eventually after replacing IDE
cables and I think even the boot/system hard
drive, I realized, jesus it might well be the RAM. Turned out it was!
The last time I'd had RAM go bad was about 14
years earlier on an Amiga development system. :slight_smile:

Good luck!

If your motherboard is flexible with RAM configuration and
you have multiple DIMMs, maybe you'll be able to try
swapping out some DIMMs and hopefully end up isolating the bad one.
(That's what I did, but the system only had
two DIMMs, so it was easy.)

memtest86 is your friend ... although you need another server to run
your apps while memtest86 is grinding away on the suspect one.

Thanks! Maybe I should try some of those extended memory checks - as Ed
said, though, that'll take RubyForge down for a few hours. It would be
an improvement on these last few days of sporadic crashes, though...

Yours,

Tom

···

On Wed, 2006-10-11 at 11:51 +0900, Bill Kelly wrote:

I had RAM go bad on one of our home PC's last year. It does
take awhile to get to that "let's replace all the RAM chips"
threshold, doesn't it. I forget all the troubleshooting steps
I'd taken before arriving there, but they were many. The
problem manifested itself in such weird ways. It got to the
point where I wanted to download and re-apply the microsoft
service packs (this was a win2k box), and when I'd download
the 30MB file from the networked PC upstairs, it wouldn't
extract properly, claiming it was corrupted. I forget all
the diagnostics I tried (other symptoms were random crashes
of applications or the whole OS)... eventually after
replacing IDE cables and I think even the boot/system hard
drive, I realized, jesus it might well be the RAM. Turned
out it was! The last time I'd had RAM go bad was about 14
years earlier on an Amiga development system. :slight_smile:

Good luck!

Nope, ext3.

Yours,

tom

···

On Thu, 2006-10-12 at 17:03 +0900, Martin Coxall wrote:

> Yup, it's getting to the point where suggestions like "let's replace all
> the RAM chips" are starting to sound reasonable. Although I hate to
> pester RubyCentral for more hardware money...

Is the filesystem ReiserFS?

Tom Copeland wrote:

Oh, the current machine can handle the load just fine; I'm just
wondering if flaky RAM may be causing the problems.

So now let me get this straight...you're not interested in potential new hardware? I'll take it if you don't want it! :slight_smile:

···

--
Charles Oliver Nutter, JRuby Core Developer
headius@headius.com -- charles.o.nutter@sun.com
Blogging at headius.blogspot.com

Tom, I think you missed that joke:

The Future of ReiserFS - Slashdot

:wink:

James Edward Gray II

···

On Oct 12, 2006, at 8:30 AM, Tom Copeland wrote:

On Thu, 2006-10-12 at 17:03 +0900, Martin Coxall wrote:

Yup, it's getting to the point where suggestions like "let's replace all
the RAM chips" are starting to sound reasonable. Although I hate to
pester RubyCentral for more hardware money...

Is the filesystem ReiserFS?

Nope, ext3.

Ah, this machine's a good 'un... just needs an oil change. Or maybe the
tires rotated...

Yours,

Tom

···

On Wed, 2006-10-11 at 13:41 +0900, Charles Oliver Nutter wrote:

Tom Copeland wrote:
> Oh, the current machine can handle the load just fine; I'm just
> wondering if flaky RAM may be causing the problems.

So now let me get this straight...you're not interested in potential new
hardware? I'll take it if you don't want it! :slight_smile:

>> Is the filesystem ReiserFS?
>
> Nope, ext3.

Tom, I think you missed that joke:

The Future of ReiserFS - Slashdot

:wink:

Yikes!! Yeah, I think we'll stick with ext3... egads.

Tom

Why not set up a clustered configuration for some redundancy? I mean, if a
certain well known corporation was willing to donate resources to open
source projects...

- Rob

···

On 10/11/06, Tom Copeland <tom@infoether.com> wrote:

On Wed, 2006-10-11 at 13:41 +0900, Charles Oliver Nutter wrote:
> Tom Copeland wrote:
> > Oh, the current machine can handle the load just fine; I'm just
> > wondering if flaky RAM may be causing the problems.
>
> So now let me get this straight...you're not interested in potential new
> hardware? I'll take it if you don't want it! :slight_smile:

Ah, this machine's a good 'un... just needs an oil change. Or maybe the
tires rotated...

Yours,

Tom

my experience is that each percentage point of uptime increase adds
exponential effort for the maintainers. clustered setups are insanely
complicated with any application that has state - which rubyforge does.
people think it's straightforward, but i can't tell you how many boxes we've
seen that were 'highly-available' which ignored situations like split-brain,
failback, etc. and therefore merely threw money and time at a problem without
increasing reliability... lazy replication and all the quick fixes cause more
problems than they solve - to do it right one needs good shared storage with
serious hardware redundancy and remotely controllable stonith devices to prevent
split brain. a modest setup would start at 50k depending on storage needs.
some of our systems are more like 300k - and that's for one pair.

anyhow, i think a more reasonable approach is

   a) minimize downtime. rubyforge is already there, i'm sure its uptime is
   above 98%

   b) run hot-warm with manual failover and failback. this is cheap and
   helps 'a' greatly.

   c) drbd (distributed replicated block device) could drastically reduce cost and
   eliminate the need for shared storage - but i have no experience with it...
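
For a sense of scale on the uptime figures above, here is a minimal sketch of the arithmetic. It is an illustration only: the function name is hypothetical, the 365-day year is an assumption, and none of it comes from RubyForge's actual monitoring. At 98% uptime it works out to roughly 7.3 days of downtime per year, and each further step toward 100% leaves far less room for outages.

```python
# Hypothetical helper (not from this thread): convert an uptime
# percentage into the downtime it allows over one year.
DAYS_PER_YEAR = 365  # assumption: non-leap year


def downtime_days_per_year(uptime_percent: float) -> float:
    """Return the days of downtime per year implied by an uptime percentage."""
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    return (100.0 - uptime_percent) / 100.0 * DAYS_PER_YEAR


if __name__ == "__main__":
    # Print the downtime budget for a few common availability targets.
    for pct in (98.0, 99.0, 99.9, 99.99):
        days = downtime_days_per_year(pct)
        print(f"{pct:6.2f}% uptime -> {days:7.3f} days ({days * 24:7.2f} hours) down per year")
```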

sorry for OT rant...

ciao.

-a

···

On Thu, 12 Oct 2006, Rob Sanheim wrote:

On 10/11/06, Tom Copeland <tom@infoether.com> wrote:

On Wed, 2006-10-11 at 13:41 +0900, Charles Oliver Nutter wrote:
> Tom Copeland wrote:
> > Oh, the current machine can handle the load just fine; I'm just
> > wondering if flaky RAM may be causing the problems.
>
> So now let me get this straight...you're not interested in potential new
> hardware? I'll take it if you don't want it! :slight_smile:

Ah, this machine's a good 'un... just needs an oil change. Or maybe the
tires rotated...

Yours,

Tom

Why not setup a clustered setup for some redundancy? I mean, if a
certain well known corporation was willing to donate resources to open
source projects...

--
my religion is very simple. my religion is kindness. -- the dalai lama

Why not setup a clustered setup for some redundancy? I mean,
if a certain well known corporation was willing to donate
resources to open source projects...

Yeah, we may have to do something like this. The added admin load makes
me cringe, though... but, maybe it wouldn't be too bad...

Yours,

Tom