New build: Restarts or freezes under load with 2 or more sticks of RAM

shockwaveharry
edited January 2023 in General Discussion

Good morning! I put together a new Windows 11 PC with a TUF GAMING Z790-PLUS WIFI D4, a Core i9 13900K CPU, two new G.Skill Ripjaws V 32GB (2 x 16GB) DDR4-3600 RAM sticks, two 3-month-old G.Skill Ripjaws V 32GB (2 x 16GB) DDR4-3200 sticks and an RTX 3060.

The issue I'm having: when I first assembled the PC, including all 4 sticks of RAM, it worked great until I started a game. It would almost immediately shut down and restart.

Long story short, I found I could consistently make it freeze or restart by running a CPU-Z benchmark or stress test. I got to the point of checking the RAM and found that it was stable with 1 stick of RAM but would shut down with 2 or more. I checked all the slots and all the sticks, old and new, and it doesn't matter which sticks I'm using. 1 stick = stable, 2 or more = FAILURE.

So I'm thinking it's not a RAM problem. I put the 1st stick in the slot recommended by the manual. In fact, I can put 1 stick in any slot and it's good, but it will fail with the second stick in any other slot.

Has to be the MB or the CPU, right?

Best Answers

  • shockwaveharry
    edited January 2023 Answer ✓

    Um... I fixed it. XMP enabled, 64 gigs of RAM, overclocked in XTU and running at 5.5 GHz. Solid as a rock...

    It was my PSU. 750 watts. It wasn't that old and had been working great last week with a Gigabyte board, a Ryzen 7 5800X and the same RTX 3060.

    I was gaming for about an hour and the PC suddenly shut down, but this time it didn't restart. No response when pushing the power button, as in no fans, no beeps, no nothing. Just the Aura LEDs on the MB. I had a DVOM handy and checked the CPU power plug: 0V. The PSU had died. I purchased another 750 watt PSU and that did it. Once it started up I knew it had to have been the PSU just shutting down all along. Enabled XMP, success! All 4 sticks of RAM? Operating as designed. I optimized the CPU in XTU and my PC started breathing fire!

    Thanks for all your help. I learned a lot.

  • PowerSpec_MikeW
    PowerSpec_MikeW PowerSpec Engineer
    Answer ✓

    @shockwaveharry

    So all along the system was just momentarily losing power. Makes sense after the PSU finally died. Keep an eye on the RAM; it's worth doing a round of MemTest86. You might do a stress test in the OS with something like OCCT as well. You're mixing 3200 C16 and 3600 C18 kits. It's entirely possible they use the same IC with a different SPD profile, but you never know.

Answers

  • PowerSpec_MikeW
    PowerSpec_MikeW PowerSpec Engineer

    @Shock

    Is this happening with the XMP profile loaded or on JEDEC?

    Test with MemTest86: https://www.memtest86.com/

    Just run tests 7, 8 and 9 to verify your findings. With the way it happens under load, it sounds like a rounding error. Also, please list the RAM model. And if you're loading the XMP profile, screenshot what the board is changing when you 'save and exit'.

  • Thanks for the quick reply. I'll get the info and report back.
  • Good morning Mike,

    Test 7, 8 and 9 Pass with 0 errors.

    2 sticks: F4-3200C16D-32GVK

    2 sticks: F4-3600C18D-32GVK

    UEFI settings are default (F5), XMP is disabled, and EZ System Tuning is set to normal. BTW, if I had EZ Tuning set to fast or extreme, the system would shut down a little more quickly. I.e., in normal mode with 2 sticks of RAM, I'd run a benchmark and then run stress tests 10 seconds at a time. It would fail during the third or fourth stress test. With extreme selected, it would fail during the first moments of running the benchmark.

    Another clue: With default settings and normal tuning, it will not fail (10 or more stress tests) with any one RAM stick. The more sticks I add, the faster it fails.

  • PowerSpec_MikeW
    PowerSpec_MikeW PowerSpec Engineer

    @shockwaveharry

    Might be unstable on the default JEDEC 1.2V. Mixing 3200 C16 and 3600 C18 could be a problem. It could be the same die with a different SPD profile, or it could be the same die but different binning. There are two tests I'd like you to try to see whether it still fails.

    Either kit, two sticks only. Enable XMP in the BIOS, so it ups the voltage. If that fails, try the same two sticks, but in the same channel. See if it's becoming unstable in dual channel mode.

  • OK. Enabled XMP and ran the 2 3600 sticks in slots A2 and B2. Failed on the 3rd stress test, so no change. I then switched to A2 and B1. It took 3 attempts to restart and it would only POST in safe mode (it never did this before). On the restart, the motherboard LEDs cycle between red and yellow 3 times before going into safe mode. I couldn't get Windows to start.

    Since the only change was enabling XMP, I disabled it and attempted another restart and it booted normally to Windows.

    I'll have to stop here until this evening.

  • PowerSpec_MikeW
    PowerSpec_MikeW PowerSpec Engineer

    @shockwaveharry

    Could certainly be an IMC issue. For the single channel test try B1/B2 so it'll run in single channel mode. You might try A1/A2 as well to see if it's potentially a bad channel.

  • @PowerSpec_MikeW

    I spent some more time working on this problem last night. I had previously installed Intel XTU so I decided to use that to benchmark and stress test, just to see if there was a difference. XTU shows a bunch of info and graphs while running the benchmark and stress tests. With 1 stick installed, I immediately noticed that my CPU package temp jumped way high during the CPU stress test and started thermal throttling. I ran it for 2 minutes and the temps jumped between 60 and 90, slowly creeping upward the longer I ran it. If I added sticks it would run even hotter and freeze or restart. It also gave me an option to run a RAM stress test. If I did that, it would almost immediately shut down, even with one stick.

    So I'm thinking my cooler is insufficient. It's a Rosewill 240mm AIO. I assumed it would be more than enough but it's looking like it was a bad choice. I'll swing by Micro Center tonight to see what's available.

    Do you think I'm on the right track? If yes, should I look for another AIO or a big air cooler? BTW, I can't go bigger than 240 or 260 in my case, a Lian Li Mesh.

  • PowerSpec_MikeW
    PowerSpec_MikeW PowerSpec Engineer
    edited January 2023

    @shockwaveharry

    No, I don't think temps are your problem. A cooler upgrade would improve your overall performance, but a 13900K is going to pull a lot of power and it's going to run hot. On a stress test that loads all cores, I fully expect it to hit 100C and stay pretty close to that mark. I've hit 100C with a custom loop and a 420mm rad. It should not be throttling, so the frequency should at least stay above the base clock frequency. You're not going to hit the max boost on an all-core stress test. You might look at the Lian Li Galahad 280. With better cooling you'll still run hot, but you'll clock higher and more aggressively as well. That cooler can usually handle 250W.

    It would be worth reseating the CPU and verifying the cooler installation. Uneven pressure on the CPU could create the issues you're experiencing. Could also be a bad board or CPU.

  • @PowerSpec_MikeW

    Thanks for your continued support. I've previously reseated the CPU (more than once) and I feel confident the water block install is optimized. Definitely rechecked it more than once.

    I ended up purchasing and installing the Lian Li 240 this evening. With just that change it initially ran about 10 degrees cooler, but would still creep up and eventually restart during a 5 minute test. After that, I tried disabling ASUS Multi Core Enhancement. That change, along with the new AIO, let me run a test for 10 minutes without ever breaking 90 degrees. It indicated power limit throttling during the entire test (because I disabled MCE, I think) but never indicated thermal throttling.

    I then went from 1 to all 4 sticks. With 4 installed, I ran the test and watched the temp slowly increase. It started hitting 95+ and thermal throttling at the 4 minute mark. I ended the test at that point but it froze on me.

    Worth mentioning: When I had 4 sticks on the board, I tried enabling the XMP profile but it refused to POST. Also, just for fun I tried the speed optimizer in XTU and the computer crashed as soon as I started a test.

    At this point, I'm going to warranty the MB and CPU. I'll let you know how it goes.

  • PowerSpec_MikeW
    PowerSpec_MikeW PowerSpec Engineer

    @shockwaveharry

    For the purposes of troubleshooting, stick to two sticks from the same kit first. Four sticks is a lot of additional stress on the memory controller and mixing 3600 C18 and 3200 C16 is a gamble. You'd be better off loading the XMP with the 3200 C16 kit, then manually setting the latency for the 3600 C18 kit before you save and exit. Then add the 3600 C18, run everything at basically 3200 C18 to be safe. That may be stable.
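A rough sketch of the arithmetic behind that suggestion, in Python. It assumes the usual rule of thumb that first-word CAS latency in nanoseconds is CL x 2000 / (transfer rate in MT/s); the helper names `true_latency_ns` and `common_settings` are made up for illustration and are not part of any tool mentioned in this thread.

```python
def true_latency_ns(speed_mts: int, cas: int) -> float:
    """First-word CAS latency in ns (CL cycles at a memory clock of speed/2 MHz)."""
    return cas * 2000 / speed_mts


def common_settings(kits):
    """Pick the slowest speed and the loosest CAS across all kits.

    Hypothetical helper: `kits` is a list of (speed_mts, cas) tuples.
    """
    speed = min(s for s, _ in kits)
    cas = max(c for _, c in kits)
    return speed, cas


# The two kits from this thread: DDR4-3200 C16 and DDR4-3600 C18.
kits = [(3200, 16), (3600, 18)]
speed, cas = common_settings(kits)

for s, c in kits:
    print(f"DDR4-{s} C{c}: {true_latency_ns(s, c):.2f} ns")
print(f"Mixed-kit target: DDR4-{speed} C{cas}, {true_latency_ns(speed, cas):.2f} ns")
```

Both kits come out at the same rated latency in nanoseconds, which fits the idea that they may be the same IC with a different SPD profile. The combined 3200 C18 setting is a little looser than either kit's rating, which is the point of running it "to be safe".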

    With the temps, I wouldn't worry too much about that. It will indicate thermal throttling, but by the design of the CPU it really isn't throttling in the sense that we're used to. It used to be that hitting TjMax would hard throttle the CPU, usually halving the multiplier, and the performance would tank. The newer behavior applies to Ryzen Zen 2, Zen 3 and Zen 4 and to Intel 12th and 13th Gen. Everyday stable overclocking on both platforms now means undervolting to lower your temps and allow dynamic overclocking.

    With the Intel 12th/13th Gen there is no power limit on most boards. Intel has its PL1/PL2 power limits, which are the long-duration and short-duration power limits. Well, now their spec sheets say PL1 = PL2, so there is no separate short-duration power limit. The processor will try to draw well over 300W, so you're going to hit a thermal wall. The CPU is designed to sit on that thermal wall and maximize performance. If you run Cinebench R23, it will typically sit at 100C, but verify the boost is over the base clock; that's how you measure your cooling performance now.

    On the AMD side, you only really had temperature issues with some of their Zen 3 processors. The 5800Xs with a single CCD ran hot. The 5900X and 5950X were never really an issue, the reason being that AMD had a 142W socket power limit. With that many cores you'd hit your power wall well before you ever hit a thermal wall. You could see higher temps overclocking the lower end processors due to this. The Zen 4 7000 CPUs are different. The socket power limit is now 230W. You're going to see higher temperatures with these processors, similar to what we've seen on the high end Intel 12th/13th Gen CPUs.
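For anyone wondering where the 142W and 230W figures come from, here is a small Python sketch. It assumes the commonly cited relationship that AMD's socket power limit (PPT) is about 1.35 times the rated TDP, and that the parts in question are rated at 105W (AM4) and 170W (AM5). Neither TDP figure is stated in this thread, so treat both as assumptions.

```python
# Assumed ratio between socket power limit (PPT) and rated TDP.
PPT_FACTOR = 1.35


def socket_power_limit_w(tdp_w: float) -> int:
    """Approximate socket power limit in watts for a given rated TDP."""
    return round(tdp_w * PPT_FACTOR)


# Assumed TDP ratings; not taken from the thread.
for label, tdp in [("AM4, 105W TDP", 105), ("AM5, 170W TDP", 170)]:
    print(f"{label}: about {socket_power_limit_w(tdp)}W at the socket")
```

Under those assumptions the numbers line up with the 142W and 230W limits quoted above.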

  • @PowerSpec_MikeW

    I have a stable, working computer now with 32GB of RAM, but I don't think it's optimized.

    I ended up exchanging the CPU, MB and RAM. Put it together and started testing all over again.

    Here's what I now have: It runs well with default UEFI settings, which includes XMP disabled, and I can run 2 sticks of the 16GB 3600 C18 RAM in A1 and A2 only. No crash running a benchmark. It also looks like I can run the CPU and memory stress tests indefinitely (each over 30 minutes so far without freezing or restarts). However, the RAM is running at 2666 and 1.2V.

    The RAM set up like that is the only thing that works. 4 sticks of 3600? Crash. Running 2 sticks in A2 and B2 as recommended? Crash. XMP 1 or 2 with the working setup? Immediate crash. Manually setting the voltage and speed? Won't even start.

    On top of that, there's no CPU overclock scenario in either UEFI or XTU that works even with a single stick. Immediate crash.

    As it now sits, this is the best computer I've ever had. I've only played for a few minutes but Forza runs much better than my old PC so I am happy now, provided it continues working as it does now.

    I'd still like to know why I can't get anything to work well with higher settings. Is this strictly a RAM problem? I read that Asus has problems with higher RAM settings on its MBs and I know my TUF Gaming Z790 is on the lower end of the playing field. Would you recommend a different motherboard? Is my hardware OK and I'm just missing something in the settings? I'd really like this PC to work like it's supposed to and be a little future proof.

  • PowerSpec_MikeW
    PowerSpec_MikeW PowerSpec Engineer
    edited January 2023

    @shockwaveharry

    A1/A2 is single channel. You're putting a lot of additional load on the CPU with A2/B2 as you're running it in dual channel. Test it in B1/B2 as well, just to verify it's not a bad slot or bad channel.

    Also if you could, download CPU-Z: https://www.cpuid.com/downloads/cpu-z/cpu-z_2.04-en.exe

    Screenshot the SPD/Memory tab with each module selected. That G.Skill RAM used to be almost entirely Hynix CJR/DJR and it was great, but I have seen some Nanya getting mixed in with the newer revs. There's a code on the module that'll tell you the density and IC manufacturer, but CPU-Z will show us that.
