The saga of misery continues


Valkyrien
01-29-2003, 05:36 AM
OK, I'm gonna stop hijacking mucksmear's thread and start my own. For those who haven't been watching this thread (http://www.cgtalk.com/showthread.php?s=&threadid=36386), just click the link and read about the evolution of the problems with my new computer components.

Now that everyone's up to speed, here's the latest update.

I got ahold of a new XP CD. This is to test my theory that neither my CD drives nor my CDs are the source of my problems. It seems unlikely that two separate CDs have the exact same problem with the exact same file (ntkrnlmp.exe) and even more unlikely that three CD drives purchased over a period of several years have suddenly developed the same problem at the exact same time.

So, I boot from the new CD, and get the same problems I've been getting all along. Hardly surprising by this point, but still discouraging. I try it with yet another CD-ROM, this one brand new. Same damn error. Ok, so I've effectively proven that neither the CD drives nor the discs are to blame.

I take my HD out and put it into another computer, with the idea that maybe I can install XP on it from that computer. I boot from CD. It goes off without a hitch. Windows starts, and runs fine. Hooray, I now have a formatted HD with an OS on it. I am happy now. That happiness was short-lived however, because when I put the HD back in my computer and started it up, I got this error: "Windows could not start because the following file is either corrupt or missing: <Windows Root>/system32/ntoskrnl.exe. Please reinstall the file." But how? It's not as though I can selectively go through the setup files and install one particular file.

So I say to myself "something still looks fishy." After all, Windows did start up just fine on the other computer. So I put my HD back in that machine and start up. I go to system32. Lo and behold, ntoskrnl.exe is there right where it should be. It seems to be in perfect condition too. It can't be run in Win32 mode, but that's completely normal. What a damn shocker that my computer is giving me a false alarm that prevents me from using it. :rolleyes: So I think to myself that maybe the file can still be corrupt without any outward sign of it. So I do a full reinstall. That finished, I put the HD back in my computer. There should be nothing standing in my way. Naturally though, I get the ntoskrnl.exe error again. The cursing starts. I give up and go to bed.

The next day, it dawns upon me. Maybe there's a problem with the BIOS that causes these errors! Seems logical enough. I format a startup disk. I go to www.asus.com and grab AFLASH and the latest BIOS revision. I put them on the disk. I test the disk on another computer to make sure that it will boot from it. It does. I think "oh joy, this may all be over soon." How wrong I was... I put the disk in, I start the computer. I get an error (NOOOOOOO!!): ERROR BOOTING FROM DISK. REPLACE WITH SYSTEM DISK AND PRESS ENTER. I check the disk on another computer, and it runs fine. I try it again on mine. Same error again. "well **** you, you stupid piece of shit!"

The past couple of weeks may be the undoing of my tenuous grip on sanity.

Does anyone who read through all that have any brilliant ideas that don't involve dealing with unnecessarily condescending tech support people?

dvornik
01-29-2003, 05:47 AM
It could be a cabling problem. We had a couple of Dell Dimensions that behaved that way because the cable got jammed when you closed the case. Your MB therefore has difficulty communicating with the drives. See if this helps: replace all the cables and so on.

Also, I wasn't following the thread, but the IDE controller may be too old for the drive.

[edit] Never mind, I read your other thread, will think of it.

GregHess
01-29-2003, 12:43 PM
I agree with dvornik. Here's some tips to go through once again.

If you're encountering any computer issues...always...

1) Check all cables in the case. Make sure they're firmly connected, and nothing is pulling or hanging on the cables. If you have spares, consider replacing them temporarily (even lower speed IDE cables work fine) to help eliminate possible cable issues as a variable.

a) This is especially true with rounded cables. I've encountered a number of issues with IDE RAID arrays and rounded cables...sometimes some interference would occur along the cable line, and cause some communication/disk errors to occur, fubaring the raid. If you are using rounded cables, make sure you get the models with ground wires, and shielding for the entire cable length. (Coolermaster makes a model which is both grounded against EMI, and has ground wires spaced between data wires).

2) Remove all the ram sans the amount necessary to run the system. (If it's a dual channel system, then 2 dimms, otherwise have just 1 in Dimm bank 0/1) Run memtest86 (www.memtest86.com) for a single pass, with ECC disabled, all tests. (Press esc to configure it). If you get a SINGLE error, try increasing the ram timings, and run the test again. If the error repeats, it's a bad dimm and needs replacement.

a) Remember that ram errors are the cause of much grief in a machine. They can cause all types of system errors and are quite possibly the "MOST easily" damaged component. Just because the ram ran fine before, doesn't mean it'll run fine after you dropped the computer down the stairs.

3) In the bios, check both the CPU temps and the voltage variations. If you see the voltage varying by more than .05-.1 V on the cpu voltage, it could be a damaged or bad PSU. Voltage variances in the ram, agp, and cpu (as well as the rails) can cause a variety of problems, including lockups, ram errors, and hardware damage.

4) Check the motherboard for visible signs of damage. I just recently had a cockroach get into an Asus A7V333 system and fry the motherboard. Damn little bugger. Sometimes a motherboard will continue to function with slight damage levels (scratched pcb's for example), but may have issues executing certain operations. (Ex. A piece of metal fell on my A7M266-D and fried the bottom PCI slot...you can even see the scorch mark...but damn if that thing hasn't crashed once in the past 16 months. Just as long as I don't use the bottom pci slot it's fine :) )

Er, that's some random tips to try and get around this problem again.
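
As a rough illustration of tip 3 above, here is a minimal Python sketch of the voltage check. It assumes you have written down a handful of CPU voltage readings by hand from the BIOS hardware monitor; the sample numbers and the function names are made up for the example, and the 0.1 V tolerance is simply the upper end of the range quoted in the tip.

```python
def voltage_swing(readings):
    """Return the spread (max - min) of a list of voltage readings, in volts."""
    return max(readings) - min(readings)


def rail_looks_unstable(readings, tolerance=0.1):
    """Flag a rail whose readings vary by more than `tolerance` volts.

    The default tolerance is an assumption taken from the rule of thumb
    in the post above, not a manufacturer specification.
    """
    return voltage_swing(readings) > tolerance


# Hypothetical readings noted down from the BIOS hardware monitor.
steady_vcore = [1.65, 1.66, 1.64, 1.65]
jumpy_vcore = [1.65, 1.66, 1.64, 1.78]

for label, samples in (("steady", steady_vcore), ("jumpy", jumpy_vcore)):
    swing = round(voltage_swing(samples), 2)
    verdict = "suspect PSU" if rail_looks_unstable(samples) else "looks ok"
    print(f"{label}: swing {swing} V -> {verdict}")
```

A swing over the tolerance is only a hint to try another power supply, not proof that the PSU is bad.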

Valkyrien
01-29-2003, 04:54 PM
hmm, well as for cable problems, the Motherboard doesn't seem to have any problems detecting my drives, since they all show up on my boot list and the list of IDE Devices.

And methinks the RAM is fine. The system runs a memory check, and it actually shows slightly more than I have...

GregHess
01-29-2003, 05:05 PM
The system memory check is not memtest. It will not detect errors, and will not reveal any problems whatsoever with the ram. It is nothing more than a test to make sure the physical ram is detected.

I seriously recommend running memtest86. You might solve all your problems just by running the test...or you can try a thousand other solutions, and eventually find out it might just be the ram. It's up to you.

Also, it's usually a good idea, if you "think" something, to go ahead and make sure. Otherwise you can end up following the same troubleshooting paths in different directions, and different ways, and spend weeks working on a problem, instead of attacking it in a concise pattern, with verifications along the way.
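
To make the difference concrete, here is a toy Python sketch. It is not how memtest86 works internally (the real tool runs many more test patterns against real hardware); `SimulatedRam`, `presence_check` and `pattern_test` are invented names, and the "stuck bit" fault is an assumption made purely for the illustration. It only shows why counting memory says nothing about whether each cell holds what was written to it.

```python
class SimulatedRam:
    """Toy stand-in for a DIMM: a byte array with an optional stuck-at-1 bit."""

    def __init__(self, size, stuck_addr=None, stuck_mask=0x00):
        self.cells = bytearray(size)
        self.stuck_addr = stuck_addr
        self.stuck_mask = stuck_mask

    def __len__(self):
        return len(self.cells)

    def write(self, addr, value):
        self.cells[addr] = value & 0xFF

    def read(self, addr):
        value = self.cells[addr]
        if addr == self.stuck_addr:
            value |= self.stuck_mask  # the faulty bit always reads back as 1
        return value


def presence_check(ram):
    """Roughly what a boot-time memory count does: report how much is there."""
    return len(ram)


def pattern_test(ram, patterns=(0x00, 0xFF, 0x55, 0xAA)):
    """Write each pattern to every cell, read it back, collect bad addresses."""
    bad = set()
    for pattern in patterns:
        for addr in range(len(ram)):
            ram.write(addr, pattern)
        for addr in range(len(ram)):
            if ram.read(addr) != pattern:
                bad.add(addr)
    return sorted(bad)


good = SimulatedRam(1024)
faulty = SimulatedRam(1024, stuck_addr=300, stuck_mask=0x04)

print(presence_check(good), presence_check(faulty))  # both report the same size
print(pattern_test(good), pattern_test(faulty))      # only the pattern test differs
```

Both modules pass the size count, but only the write-and-read-back pass exposes the bad cell.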

Ex.

1) Check cables. Replace cables. Hmm doesn't seem to be cables, cables ok. Check.

2) Check ram, run mem test. Hmm...ram seems ok. Check.

3) Power supply troubleshooting, etc etc.

Maven
01-30-2003, 12:49 PM
I had this problem not too long ago with win2k. Take out ALL unnecessary devices...sound, ethernet card, all your expansion cards, extra hard drives and optical drives. Just leave the basic components needed to run a computer. Then do a clean install of XP with your machine.

jscott
01-30-2003, 01:55 PM
To reinforce Greg's last post.

Sometimes it helps me to remember the following:

After checking a few obvious things, realize that you don't know what the problem is and that any of your perceptions about the problem could be totally wrong. The next logical approach is not to keep trying to find out what the problem is, but rather to find what's not the problem so you can narrow your search.

I hope it works out for you.

-jscott
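
The "find what's not the problem" approach can be sketched as simple set subtraction. The Python below is only an illustration: the component list is a simplified assumption, and the notes are paraphrased from what Valkyrien reports in this thread.

```python
# Simplified, assumed list of everything that could be at fault.
suspects = {
    "install CD", "CD drive", "hard drive", "IDE cables",
    "RAM", "motherboard", "power supply", "CPU",
}

# Components ruled out so far, with the observation that cleared each one
# (paraphrased from the thread).
cleared = {
    "install CD": "a second XP CD fails the same way",
    "CD drive": "several drives, one brand new, fail the same way",
    "hard drive": "installs and boots XP fine in another computer",
    "IDE cables": "the same cables worked in the old machine",
}


def remaining_suspects(suspects, cleared):
    """Return the components that have not yet been ruled out."""
    return sorted(suspects - set(cleared))


for part, why in cleared.items():
    print(f"not the problem: {part} ({why})")
print("still to test:", ", ".join(remaining_suspects(suspects, cleared)))
```

Each verified test shrinks the list, which is the point of checking things off instead of assuming they are fine.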

kwshipman
01-30-2003, 02:52 PM
Originally posted by GregHess
4) Check the motherboard for visible signs of damage. I just recently had a cockroach get into an Asus A7V333 system and fry the motherboard. Damn little bugger. Sometimes a motherboard will continue to function with slight damage levels (scratched pcb's for example), but may have issues executing certain operations. (Ex. A piece of metal fell on my A7M266-D and fried the bottom PCI slot...you can even see the scorch mark...but damn if that thing hasn't crashed once in the past 16 months. Just as long as I don't use the bottom pci slot it's fine :) )

Er, that's some random tips to try and get around this problem again.

Please tell me that was at your school, not at home:eek:

The being damaged part I agree with. I made a small chip on the motherboard and one of the ram slots (0,1) does not work; luckily I was able to use the third bank after getting error messages.

GregHess
01-30-2003, 02:54 PM
Kw,

I'm the admin for the entomology department....hehe bug city :).

I've been installing makeshift charcoal filters now on a variety of the machines to block front access...unfortunately they can still squeeze through the gaps in the case frame. (Between the door and frame)

kwshipman
01-30-2003, 03:06 PM
yea, I couldn't remember how to spell entomology, this place needs spell-check. Have a funny bug story, but we've hijacked this thread enough, so maybe another time....now that I think about it, it's not that funny.

An Erased One
01-30-2003, 03:39 PM
I had a similar problem with Win2K (I think it was also the ntoskrnl.exe or so), may be something completely different, as I didn't read the other thread, but -
are the CD-ROM and the HD on the same IDE cable? That kept me from re-installing for 2 weeks, and I still don't have the slightest idea why, it just doesn't work on my PC that way.
I just had to use 2 different IDE Channels, and it worked.

Gyan
01-30-2003, 04:06 PM
Methinks the chip/circuits responsible for drive I/O is screwed on your motherboard.

MadMax
01-30-2003, 04:35 PM
Okay, approach this logically......

You have made certain your CD is fine as you have another system to test with.

Have you tried the drives and cables from the new machine in your old one, just to be certain that is not a problem? I sure would. Always eliminate problems even if you don't think them likely. Being adamant that something couldn't possibly be wrong in one area will almost guarantee that, after weeks of beating one's head against the wall, it will turn out to be the problem.

Once you eliminate cables and drives as problems, test the other components as well in your old machine.

Memtest86 is free. For god's sake download it and use it. Brand new ram doesn't mean it can't be bad. I have had machines that worked 100% flawlessly on virtually all apps, but just had one app that would crash. Guess what? memtest found errors......

Also, try posting your info again. This is your thread, start from the top. Also, I'm too lazy to wade through 100+ messages to get your particulars.

webfox
01-30-2003, 06:44 PM
Have you tried removing the battery from your mainboard and starting over after a few minutes?

It does sound like something's wrong with the board, though. I'm just thinking that if you reset the system and try again, you can eliminate some variables.

Also, make sure that you don't have "Plug and Play OS" checked in your BIOS. That usually makes the BIOS look for other boards, like SCSI or PCI IDE that may have a system disk attached to them and try to boot from them, instead. You don't want that. You want it to boot straight from the main IDE controller.

Just a thought.

Good luck.

DaForce
01-31-2003, 12:10 AM
IMHO it will be the ram.

I had the same sort of problem in my spare computer. P3 500 256mb ram (2x 128mb Dimms). I gave it to my GF to use at home, and she was complaining that every so often it would lock up. Then the computer would not even post sometimes. So I took it back to my place and gave the OS the once over (win2k); everything seemed ok. Then one day when I turned it on it got to the windows login screen, and then this weird ass window came up, the likes of which I have NEVER seen before. And it said a whole bunch of stuff about fatal system errors and can't read from this file and that file. It was very different to any other window I have seen, it was a different shape, size and color. I dubbed it "The Error Window of Death".

Anywayz, I ran memtest, found that one of my ram dimms had crapped itself, so I took it out, and now everything is peachy. Apart from the fact that it's running only 128mb....that's a real punch in the guts.

I really should get some more ram for that machine.

p.s. i didnt have to reinstall windows either.

Check your ram!!!

singularity2006
01-31-2003, 02:01 AM
yah, i hellza agree. I've had RAM be the sole issue in most of my problems where everything else has been exhausted...

KayosIII
01-31-2003, 04:47 AM
I suspect the reason that the transplanted system did not boot was because you have not put the hard drive in the same physical location as it was in the computer you installed in...

If the hard drive was in primary slave in the machine you installed XP in, you will need to have the hard drive in primary slave in the new machine...

This is to do with the way that the XP bootloader works...

The only other advice I can give is to remove all non essential hardware and try installing again. I have had sound cards prevent OS installation in past.

singularity2006
01-31-2003, 06:43 AM
last time I got an error related to IDE channels, it was because I had the jumpers set wrong on two of my devices. My Zip100 has two settings for slave: including a jumper in one place, or leaving all jumpers out. Then the HD itself has two settings for primary master: as a single drive, or master w/ secondary device... that was the one thing that kept my win2K from installing right for quite a long time too...
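
The jumper mix-up described above can be checked on paper before opening the case. The Python sketch below is a simplified model, not a statement of the ATA specification: it assumes three jumper roles ("master", "slave", and a "single" setting like the one mentioned for that hard drive), and the device names and layouts are hypothetical.

```python
from collections import defaultdict


def channel_conflicts(devices):
    """Return a list of jumper problems for (name, channel, role) tuples.

    Simplified rules assumed for this sketch:
    - at most one master and one slave per IDE channel
    - a drive jumpered as "single" must be alone on its channel
    """
    by_channel = defaultdict(list)
    for name, channel, role in devices:
        by_channel[channel].append((name, role))

    problems = []
    for channel in sorted(by_channel):
        roles = [role for _, role in by_channel[channel]]
        for role in ("master", "slave"):
            if roles.count(role) > 1:
                problems.append(f"{channel}: more than one device jumpered as {role}")
        if "single" in roles and len(roles) > 1:
            problems.append(f"{channel}: a drive jumpered as single shares the cable")
    return problems


# Hypothetical layouts: the first repeats the mistake described in the post.
bad_setup = [
    ("hard drive", "primary", "single"),
    ("Zip100", "primary", "slave"),
    ("CD-ROM", "secondary", "master"),
]
good_setup = [
    ("hard drive", "primary", "master"),
    ("Zip100", "primary", "slave"),
    ("CD-ROM", "secondary", "master"),
]

print(channel_conflicts(bad_setup))
print(channel_conflicts(good_setup))
```

Always check the jumper diagram printed on the drive itself, since the available settings differ between manufacturers.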

Valkyrien
02-01-2003, 03:13 AM
I haven't changed the physical locations of the drives:) It's been suggested that I replace my unregistered PC2700 DDR with registered PC2100 DDR, simply for the reason that registered is better, and the board won't run RAM at more than 2100 anyway, so that might be a source of problems. I'll also try removing all that hardware...there are only two open PCI slots ATM;)

Valkyrien
02-01-2003, 03:15 AM
oh, i realized that I already eliminated the cables as a source of the problem by hooking the drives and everything back up to the old machine using said cables when I was trying to rescue my files way back two weeks ago:)

DaForce
02-01-2003, 03:40 AM
Dude, stop everything, go and download memtest, burn it onto a cd. Boot from that cd and then test your RAM. It's a very easy thing to do, and it will most likely show you what is causing your problems.

Don't buy more ram yet, just run memtest to rule out the ram problem.

no buts, ifs or whys, just do it.

GregHess
02-01-2003, 04:09 AM
Valk...

It's been how many days now since we originally suggested a memtest??? Have you run a memtest yet? Are you planning to? If you're not, I suggest you stop asking for assistance, cause it doesn't seem like you want any advice.

Not to be rude, but I think when a dozen or so people suggest running memtest, it might be a good idea to possibly listen to them.

Valkyrien
02-01-2003, 04:20 AM
I've spent very little time in my room over the past few days, no more than a couple hours with the exception of sleeping, and in the little waking time I have there, I have more important things to do than deal with the damn computer. Which isn't to say I'm not grateful for all the suggestions, I just haven't had much chance to implement them (I'm at home for the weekend, so it'll be a couple of days yet). First thing when I get the chance is memtest.:)

Sieb
02-01-2003, 06:24 AM
Well for one, transplanting a HDD with an OS on it won't work period. Windows sets up the OS with hardware specific information. Sticking that drive into another system will crash windows when it goes looking to boot the hardware it was installed with. But this rules out the hard drive as the problem. Registered memory or not isn't really an issue unless it's a dual server board that requires registered ram to control large sums of ram. Otherwise it's a wasted expense since registered ECC isn't cheap. But a bad stick could cause issues.

Running memtest is a good idea to start with.

Here is what I do when I encounter this problem with a customer's system:

-replace the cables with ones I know work (I usually don't put cdroms and hard drives on the same IDE channel/cable)

-Make sure, if it's a newer drive, that it has an 80-conductor ribbon cable (though the bios would tell you if it wasn't).

-Remove all cards except video and only leave in the ram needed to setup the system (usually just 256).

-Install a known working faster cdrom drive

-Change the IDE header used for the HDD (from IDE slot 1 to 2 or vice versa).

-do an fdisk to the drive, wiping all the partitions off, as well as doing "fdisk /mbr" to wipe the master boot record, then try a "format /u". This will write zeros to all sectors of the drive and erase any residual boot information left behind.

-Bios flash

-I might swap the PSU in case enough power isn't being transferred over the rails to the drives. This will cause spikes/dips during data transfer, and data gets lost or corrupted.

-If I can rule out all the hardware, then I start thinking its the mobo, or more specifically, the IDE controller chip going bad. I had this same thing happen, after swapping the board out into another system, the board stopped reading all IDE devices and never worked after that.

Valkyrien
02-01-2003, 06:35 AM
Originally posted by Sieb
-do an fdisk to the drive, wiping all the partitions off, as well as doing "fdisk /mbr" to wipe the master boot record, then try a "format /u". This will write zeros to all sectors of the drive and erase any residual boot information left behind.

-Bios flash


whoosh! all that went right over my head...;) fortunately I believe setup will allow me to wipe the partitions anyway, but the rest of that stuff...:surprised thanks for the tips!:)

MadMax
02-01-2003, 06:43 AM
Originally posted by Sieb
Well for one, transplanting a HDD with an OS on it won't work period. Windows sets up the OS with hardware specific information.


Actually that isn't 100% correct. A lot of people who bought nForce2 boards moved their hard drives from their old systems, and then did a repair install of XP.

While this is not a method I recommend, it does work.

Valkyrien
02-01-2003, 06:50 AM
actually, there was one thing that ME was good for (shocking, I know!). One time, I brought my HD home with me on the weekend so I could work on some files here. I plugged the HD in in place of my parents', and after the startup sequence during which Windows calibrated for every piece of hardware in my parents' comp, it ran exactly like it did on my computer (minus most of the RAM, and a good GHz of CPU speed, an inferior GFX card, etc;) )

Valkyrien
02-06-2003, 05:50 PM
I actually managed to get a few minutes to try some things out...amazing! First off was the cable test. I replaced the new cables with the ones from the old comp that I know worked. Comp still can't find the same files. Tried switching the RAM to another slot, with no success. All extraneous cards were unplugged for both tests. I only left in the video card. Next up: memtest!

Valkyrien
04-27-2003, 10:08 PM
UPDATE!!!!

Finally some good news! The computer place I took my machine to more than a month ago finally determined that my MoBo was bad! It took so long because the ****ing idiots didn't call me to ask questions like whether or not the mobo took XPs as well as MPs, and they wouldn't try other parts without knowing for sure. But it looks like I'll have a working machine by the end of the week! yay!

DaForce
04-27-2003, 11:50 PM
Good to hear, that's some saga you have been through!!

Tell me, did they try putting in different memory into your computer, or did they just use your ram?

Valkyrien
04-28-2003, 05:09 PM
they used mine. As I suspected, it was just fine. What I do think is that there was some problem with the FSB. The number 533 sounded very familiar when I was looking at other mobos the other day. My theory is that the FSB was somehow jacked way way up, and the only reason my processor didn't fry is the same self-defense mechanism that refused to recognize the WinME install disc as a valid system disc;):p

DaForce
04-29-2003, 12:49 AM
hmmm that sounds pretty dodgy.

ahh well we will see when you get your new mobo

Valkyrien
04-29-2003, 06:08 PM
*waits patiently for the phone to ring*

MattClary
04-29-2003, 07:29 PM
Sorry I didn't see this thread sooner. By the end of your first post, I was thinking it was the motherboard. It was the only thing that seemed constant in the equation.
