I have what I think is a rather standard use case but I am nonetheless having difficulties finding a solution.
Basically I am implementing a CPU-bound API server. I did this using txjsonrpc, which is a JSON-RPC server built on top of Twisted.
Unfortunately Twisted is not multi-process (or even really multi-threaded), so when an API call hogs the CPU (as it sometimes does) all other calls have to wait until it's done.
The natural solution seems to be to set up a multi-process job pool, put jobs on the pool as they come in, and return responses when they're done, but this does not really seem to be supported by Twisted (i.e. it interferes with Twisted's internal concurrency mechanism because it would involve blocking calls). Apparently the multiprocessing API and Twisted do not play well together, and Ampoule (Twisted's replacement for multiprocessing) does not seem to be maintained.
So my question is: is there some set of tools or libraries that can help me implement a multi-process server (ideally REST or JSONRPC) or do I have to implement it all myself?
I should add that it would be even better if I could use something like the "distributed" library (built on top of dask) instead of the multiprocess API.
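To make the shape of what I'm after concrete, here is a minimal sketch of the pattern, using plain asyncio rather than Twisted and a toy `cpu_heavy` function standing in for the real API call (Twisted should be able to consume the same executor future via `Deferred.fromFuture`, but I haven't wired that up here):

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

def cpu_heavy(n):
    # Toy stand-in for an expensive, CPU-bound API call.
    return sum(i * i for i in range(n))

async def handle_request(pool, n):
    loop = asyncio.get_running_loop()
    # The work runs in a separate process; the event loop stays free
    # to accept and serve other requests in the meantime.
    return await loop.run_in_executor(pool, cpu_heavy, n)

async def main():
    with ProcessPoolExecutor(max_workers=2) as pool:
        return await asyncio.gather(handle_request(pool, 10),
                                    handle_request(pool, 100))

if __name__ == "__main__":
    print(asyncio.run(main()))  # [285, 328350]
```

The point is just that the blocking call lives in a worker process while the reactor/event loop only ever awaits, which is the part that's awkward to retrofit into txjsonrpc.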
If a CPU-bound process and an IO-bound process simultaneously request IO, which process's request should be accepted first? Why?
This article can be a useful guide for parallelizing non-thread-safe and CPU-bound workloads, such as some machine learning models, for purposes such as model-to-web-service conversion. While developing the solution, I tried to pay particular attention to the advantages, disadvantages, and pitfalls of Python. Maybe there is nothing new for the masters, but I think it is a neat resource for enthusiasts.
Just thought I'd share an interesting result from something I'm working on right now.
Task: Run ImageMagick in parallel (restrict each instance of ImageMagick to one thread and run many of them at once) to do a set of transformations (resizing, watermarking, compression quality adjustment, etc.) for online publishing on large quantities of JPEG files (20k - 60k per task).
This is a very CPU-bound process.
After porting the Windows orchestration program that does this to run on Linux, I did some speed testing on c5ad.16xlarge EC2 instances with 64 processing threads and a representative input set (with I/O to a local NVME SSD).
Speed on Windows Server 2019: ~70,000 images per hour
Speed on Ubuntu 20.04: ~30,000 images per hour
Speed on Amazon Linux 2: ~180,000 images per hour
I'm not a Linux kernel guy and I have no idea exactly what AWS has done here (it must have something to do with thread context switching) but, holy crap.
Of course, this all comes with a bunch of pains in the ass due to Amazon Linux not having the same package availability, having to build things from source by hand, etc. Ubuntu's generally a lot easier to get workloads up and running on. But for this project, clearly, that extra setup work is worth it.
Much later edit: I never got around to properly testing all of the isolated components that could've affected this, but as per discussion in the thread, it seems clear that the actual source of the huge difference was different ImageMagick builds with different options in the distro packages. Pure CPU speed differences for parallel processing tests on the same hardware (tested using threads running https://gmplib.org/pi-with-gmp) were observable with Ubuntu vs Amazon Linux when I tested, but Amazon Linux was only ~4% faster.
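For anyone curious, the orchestration pattern itself is simple. A hypothetical Python sketch of the fan-out (the magick command line and paths are illustrative only, and the demo swaps in a harmless `true` so the sketch runs anywhere):

```python
import multiprocessing as mp
import subprocess

def run_one(cmd):
    # Each worker launches one external command and reports its exit code.
    return subprocess.run(cmd, capture_output=True).returncode

def run_parallel(cmds, workers):
    # One single-threaded external process per worker slot, many at once.
    with mp.Pool(workers) as pool:
        return pool.map(run_one, cmds)

if __name__ == "__main__":
    # Real usage would look something like (paths/options illustrative):
    #   ["magick", src, "-limit", "thread", "1", "-resize", "1200x", dst]
    # where "-limit thread 1" pins each ImageMagick instance to one thread.
    codes = run_parallel([["true"]] * 8, workers=4)
    print(codes)  # [0, 0, 0, 0, 0, 0, 0, 0]
```

With 64 workers on the c5ad.16xlarge, the orchestration overhead is negligible; as the edit above notes, the throughput differences came down to the ImageMagick builds themselves.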
I've posted my issue in the Nvidia forums and the Jetson Discord, but it hasn't been answered, and I could really use some help. This is the original question I posted:
"I’m not sure if this is the right forum, but I just wanted to start off by saying I’m relatively new to the Linux community, and I’ve only had my Nano 2GB for 1 day. I already got the Hello AI World object detection going, updated the entire system (even overclocked it to 1.9GHz), and switched to LXDE. During the retraining, I even disabled the entire GUI and just use PuTTY.
I’m trying to do the tutorial where they retrain the neural network to detect fruits (jetson-inference/pytorch-ssd.md at master · dusty-nv/jetson-inference · GitHub).
Upon running " python3 train_ssd.py --data=data/fruit --model-dir=models/fruit --batch-size=1 --workers=0 --epochs=2 ", the system tries to run it, but 99% of the time it freezes, and 20 or 30 minutes or so later I get “Killed” in the log, which is caused by Out Of Memory.
Yes, I’m using a swap file (20GB large) on the fastest consumer microSD available (SanDisk Pro Plus: C10, V30, A2).
Any ideas for running the command, or doing further optimization so I can actually do some training?"
People keep saying that this is a GPU issue (not being able to access enough VRAM), but my GPU usage is at 0%, making me think this is a CPU-bound process.
Why does Nvidia even sell the 2GB model if it can't do its intended purpose? Unless, again, I'm doing something wrong.
... there is no denying it.
Stop suggesting people upgrade their CPUs when they state that the game was running perfectly before the last update. I myself have been playing just fine on my i5-7600k since late 2017.
Yes, that CPU is indeed outdated.
Yes, I would get better performances if I bought an i13 17999KXW.
But that didn't keep me from playing at perfectly decent perfs (50-80 fps most of the time) for close to 5 years.
HOWEVER, the last patch completely MURDERED Tarkov's CPU usage. Before, the game used around 60-70% of the CPU resources, with the occasional spike up. Now it is perma-capped at 100%, killing all responsiveness from the rest of my software: music freezes, Discord mates stop hearing me... Hell, even in-game inputs get frozen sometimes (i.e. I let go of the W key but the character keeps running for 1-2 secs). Those things NEVER happened before, and the strangest part is that the FPS are perfectly fine still; I'd even say better than before.
I know I am not the only one.
Last patch brought with it an issue, and I hope the devs are working on a fix. I am not asking for troubleshooting help, I tried everything. It would be great to get an acknowledgement of the issue from the devs, though.
EDIT: someone pointed out that I might be due for repasting my CPU/GPU. They were right, I was even long overdue. However, this did not change anything except better temps. My CPU is still capped at 100% all the time.
I always knew EFT as a game was CPU bound... but I never knew just how much. I run the game at 1440p normally but had to RMA my card recently. The card I am stuck with is a GTX 680, a 2GB card from 10 years ago that does not even get driver updates (outside of security patches). I am able to get 70-80 FPS on Interchange and 110 on Factory at 1080p on this card. I have a 6700K clocked at 4.5GHz. I did not even consider the possibility of running this game on this card, considering Valorant crashed every 60 seconds of playing. Of course my details are set to arse mode and I may have enabled downsampling. Crazy stuff, and I am so happy that I won't be without the game.
So near the endgame, you start getting Zerg rushed and make builds that spam the screen up with lots of stuff. This is when the game slows to a crawl.
Is this slowdown CPU or GPU bound? Would throwing more horsepower at it (higher end CPU/GPU) or reducing visual effects help this? Is this the game engine reaching its limits? Would another game engine that's designed to handle this much stuff on the screen help at all?
The slowdown can make things easier but it also kinda makes the endgame scenarios less intense.
(EDIT: I was assuming this configuration was default in most distros, but now I'm starting to think it just might be some Ubuntu/Mint-specific weirdness...)
Linux has a feature called 'autogrouping', which is enabled by default on many systems, for scheduling processes (see manual page excerpt below). Essentially it causes the scheduler to act primarily on the nice level set for process groups rather than individual processes.
This generally improves responsiveness and fairness for typical use cases where the nice value is always left untouched at zero anyway, by not giving a task which is split into many processes a larger share of cpu time than one that is a single process despite having the same nice level. While your desktop session (including all apps launched through graphical launchers) typically shares one autogroup, opening a terminal window (which is typically where cpu-heavy background tasks are launched) creates a new autogroup, and background services generally have their own autogroups as well.
Are you with me so far? Here's where it gets screwy: when autogrouping is turned on, the standard per-process nice level only affects scheduling priority relative to other processes in its group. And the nice and renice commands (and their underlying system calls) are only aware of the traditional per-process nice value; they do not act on autogroups. The autogroup nice level can only be changed by writing to the virtual file at /proc/<pid>/autogroup, and none of the standard utilities for dealing with priority seem to take this into account.
While autogrouping tends to ensure fairness, what if you don't want fairness? What if you want to run a background task at very low priority? So in your terminal, instead of running
make -j32 you run
nice -n 15 make -j32. Except oops, that actually made no difference! Since its autogroup nice level is still zero and the build you just started has no other processes running in its autogroup, its per-process nice level is irrelevant.
The dark side of autogrouping is that with it enabled, the conventional commands and system calls for setting priority mostly become placebos that don't actually do anything. This means that power users wanting to actually control the priority of their processes are not getting the result they expect. Also, the few programs that set their own nice level (such as [email protected], which kindly attempts to set itself to nice +19) actually fail in the...
I just wanted to know if this bug has been fixed. Powerd has been consuming between 100 and 150% of CPU constantly because of some scheduled power event connected with Mail's delayed send feature and rendered my MBP very sluggish. Has anyone seen if this is still the case on Beta 5? Thanks a lot! MBP is 2019, i7.
I am confused as to what this is and would like to know how to disable it
Games are very latency-sensitive; any increase has significant effects on the frame rate.
I've experienced this myself when trying to run compile jobs in the background while gaming. Even when they're at the lowest scheduling priority (SCHED_IDLE), which can allocate >99% of the CPU resources to other tasks, they still cause my game to lose ~30% of its average frame rate (not to speak of stability): https://unix.stackexchange.com/questions/684152/how-can-i-make-my-games-fps-be-virtually-unaffected-by-a-low-priority-backgroun
This is likely due to buffer bloat; larger queues -> higher latencies.
RT kernels are supposed to offer more consistent latency at the cost of throughput, which should be a desirable attribute when gaming. 160 fps of throughput is nice, but I'd rather have a more consistent 140 fps.
Could they help this case or perhaps even generally be useful?
Has anybody done benchmarks on this? The newest I could find is a Phoronix benchmark from 2012 testing Ubuntu's Low-latency kernel which isn't very applicable today I'd say.
How do you even use an RT kernel? Would I have to give my game a specific priority?
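For completeness, here is how I put the background job into SCHED_IDLE (as mentioned above). A minimal Linux-only sketch; the shell equivalent is `chrt --idle 0 make -j32`:

```python
import os

if hasattr(os, "sched_setscheduler"):  # Linux-only API
    # SCHED_IDLE: only run this process when no other task wants the CPU.
    # Lowering one's own priority needs no special privileges.
    os.sched_setscheduler(0, os.SCHED_IDLE, os.sched_param(0))
    print(os.sched_getscheduler(0) == os.SCHED_IDLE)
```

Note this sets the policy for the calling process itself; passing a PID instead of 0 targets another process. As the linked question shows, even this isn't enough to fully protect the game's frame times.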
I've got a new PC built and I can't pinpoint the problem. There's no POST and I've tried a number of fixes. For one, unplugging the rig and taking out everything but the processor and its fan, then removing the battery and waiting before trying again, only to get the same results. The manual says the DDR4 RAM is in the right slots. The power supply is plugged in right. I don't really want to breadboard it but haven't tried that yet. Tried it with one RAM stick in, no RAM, GPU out, GPU in. Same result: CPU light, then solid DRAM light for a few seconds, and so on. I'm sure people have asked this before, I just can't shake my concerns.
Honestly worried my CPU got fried at some point but aren't really sure.
Build: MB: Gigabyte B450 Aorus M; PSU: 650W, can't remember the brand; RAM: 2x 8GB Corsair DDR4; GPU: Radeon RX 6700 XT; CPU: Ryzen 5 5600X. All new parts ordered earlier in the year.
Edit: I forgot to add that the computer, plugged into an HDMI monitor, only shows "no signal".
I'm building a little client-side Blazor app for the game Elden Ring. The work that it needs to do takes about 10 seconds and is locking up the UI, and also making the "This page is slowing down Firefox. To speed up your browser, stop this page." message pop up.
I've given BlazorWorker a try in order to get this running in a separate thread/in the background, but it didn't seem to help here. Do I have any options other than to convert this to an ASP.NET hosted app and do the heavy lifting on the server?
I have used a Unix-based daylight simulation tool at work for the last 20+ years called Radiance (https://www.radiance-online.org). It is a purely CPU-bound process and a real-world design tool.
After a very minor amount of effort I managed to compile it to run natively on my new M1 MBA. (Basically XQuartz is still Intel-only, so I compiled the X11 bits for Intel and then recompiled the compute parts natively.)
The performance is mind blowing.
Running this benchmark test: http://markjstock.org/pages/rad_bench.html
Single-core score: 503 seconds --> beats the current #1 by 15% (Ryzen 3950)
8-core multi-CPU score: 143 seconds --> only beaten by the 3950, a couple of Intel XE chips, and a dual-CPU Xeon setup.
Seriously impressive results for a passively cooled laptop with no code optimisation, compile tweaks etc.
Our cluster of commodity i7s is about to be replaced with Mac minis!!!
So I've been pretty vocal in the past about using Nvidia Broadcast if you're running an RTX GPU, and while I stand by the fact that it's a GREAT way to remove excess background noise, there is always the option of manual post processing. I wanted to look into some of the differences between the two and discuss which one people prefer, given the option.
For LPs where you're editing your audio AFTER recording in Audacity or what have you, there's no reason not to run both, since it means you're getting AI-powered noise/echo removal in realtime plus manual post processing to get the best audio possible. But for realtime post processing, such as when you're streaming, there are arguments not to use it, most notably that the AI-processed audio sometimes wreaks havoc with OBS's built-in audio effects, causing bizarre audio glitches like over-compression. There is also the fact that running a stream using the NVENC encoder, PLUS gaming on the GPU, PLUS using Broadcast can be taxing on it to the point where you may notice performance impacts in game.
So let's breakdown both and see what we're working with in a live environment, and you can be the judge:
Nvidia Broadcast

|Pros|Cons|
|:--|:--|
|AI-powered noise removal|GPU intensive|
|AI-powered echo removal|Only usable on RTX cards (or GTX cards with a little bit of trickery and diminishing returns)|
|No audio post-processing experience or knowledge needed|No realtime EQ or compressor built in (as of version 188.8.131.52)|
|On-the-fly dynamic noise gate adjustments performed automatically| |
OBS Built In Post Processing
|Pros|Cons|
|:--|:--|
|Can be GPU- or CPU-bound processing (depending on user preference)|Post-processing knowledge needed|
|Manually adjustable to suit user's preferences|No realtime automatic adjustments made to effects|
|Realtime EQ and compressor as well as noise gate available|Manual configuration required for all post-processing effects|
|Usable no matter your build (so long as you're capable of running OBS, which at this point most machines from the past 5 years are)| |
This being said, what are people's preferences when picking which one to run? I guess this is a very niche question but figured I'd give it an ask to see what other people's experiences have been, especially because I seem to be enjoying the live environment a lot more these days.
I'm coming back after a break and don't think I realized how bad frame drops are in high pop areas. My CPU typically tops out around 78% to 82% during chest runs while GPU is around 40% to 50%. I would think my bottleneck is CPU but I've never seen it at 90+%.
Edit: Forgot to add, I have a mix of medium and low settings and am playing at 1440p.
The title already has lots of detail, but I'll continue here. I've already run scans in multiple AVs such as MalwareBytes and Kaspersky, and right now I'm running a test in ESET. Throughout all these tests I've found LOTS of viruses, especially in Windows Defender, which was the first AV I ran tests in (it found around 13 threats, although I think most of them were the same threats repeated). From there comes my obvious suspicion of malware, but after running scans over and over again, it seems to be useless since the problem won't stop (and it's getting harder to get a scan that finds something). I've run most scans in safe mode, so that shouldn't be it.
The CPU stays at a constant temperature of 50°C; the maximum temp I found it at was 79°C, and that was a one-time occasion. That's one of the weirdest things to me: the CPU gets ridiculous amounts of stress and it stays at good temperatures (although it never drops below 40).
The fact that I've found so many viruses would seem to be the problem, but I keep searching for them and the problem won't stop. The last virus came from a quick scan in MalwareBytes in safe mode. I had to interrupt a complete scan in MalwareBytes due to a power outage, but even though it had already scanned ~1,300,000 files, it hadn't found anything, so I'm thinking that wouldn't have worked either.
This problem seems out of my scope, so I'm looking for suggestions on what to do next.
Should I check my CPU on the hardware side?
Should I keep running full scans until I find something?
Most programs in my task manager and alternatives seem legit, so could it be a Windows thing? Maybe it is a well-camouflaged miner.
Hi, I've been playing an old game called Metin2. Every process in this game runs on the CPU, so graphics, animations, and data processing are all done on the CPU. I changed settings in the Nvidia Control Panel so the game's animations no longer run on the CPU; they run on the graphics card instead. But it didn't solve my problem, which is the game screen freezing. It happens when you go AFK in a crowded place: the screen freezes, and a while later it just goes black. While researching this problem, I realized that CPU usage increases while AFK. So while I'm AFK, data processing keeps stacking up on the CPU; every event around the character piles up and makes CPU usage increase.
So I need a script or program that clears CPU work queued more than a minute ago, so the CPU won't carry useless data.
I recently have kind of high CPU usage. The culprit seems to be a python3 process. Here is what top shows:
top - 22:07:33 up 3:35, 1 user, load average: 0.98, 1.05, 1.19
Tasks: 284 total, 2 running, 278 sleeping, 0 stopped, 4 zombie
%Cpu(s): 26.6 us, 1.6 sy, 0.0 ni, 71.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 7875.0 total, 1067.3 free, 4399.3 used, 2408.4 buff/cache
MiB Swap: 8108.0 total, 8095.5 free, 12.5 used. 3041.7 avail Mem

  PID USER   PR  NI    VIRT   RES   SHR S  %CPU %MEM     TIME+ COMMAND
 4904 admin  20   0  501384 81236 11168 R  93.8  1.0 130:28.13 python3
28076 root   20   0   12976  3784  3076 R   6.2  0.0   0:00.01 top
    1 root   20   0  170252 10396  7364 S   0.0  0.1   0:02.31 systemd
    2 root   20   0       0     0     0 S   0.0  0.0   0:00.00 kthreadd
    3 root    0 -20       0     0     0 I   0.0  0.0   0:00.00 rcu_gp
    4 root    0 -20       0     0     0 I   0.0  0.0   0:00.00 rcu_par_gp
    6 root    0 -20       0     0     0 I   0.0  0.0   0:00.00 kworker/0+
    8 root    0 -20       0     0     0 I   0.0  0.0   0:00.00 mm_percpu+
    9 root   20   0       0     0     0 S   0.0  0.0   0:00.00 rcu_tasks+
   10 root   20   0       0     0     0 S   0.0  0.0   0:00.00 rcu_tasks+
   11 root   20   0       0     0     0 S   0.0  0.0   0:00.54 ksoftirqd+
   12 root   20   0       0     0     0 I   0.0  0.0   0:06.38 rcu_sched
Using "ps aux" :
USER   PID %CPU %MEM    VSZ   RSS TTY STAT START   TIME COMMAND
admin 4903  0.0  0.0   2616   264 ?   Ss   18:32   0:00 /bin/sh -c /usr/bin/python3 ./manage.py worker
admin 4904 60.6  0.9 498824 78876 ?   Sl   18:32 125:09 /usr/bin/python3 ./manage.py worker
admin 4924  0.0  0.7 124904 63956 ?   S    18:32   0:02 /usr/bin/python3 ./manage.py worker
admin 4925  0.0  0.7 122920 63248 ?   S    18:32   0:00 /usr/bin/python3 ./manage.py worker
Is it safe to just kill PID 4904? Or how can I otherwise fix this? Thanks for the help.
Can I configure these jobs to run every 5 minutes instead of every minute? It's causing sonarr to use 100% CPU on one machine and the DL client is using 100% CPU on the other machine.
I don't need to process the downloads every minute.
Why am I using 48GB of RAM running at 2933 MHz (2x8GB 3200 MHz and 2x16GB 3600 MHz kits) with my Ryzen 1600? Modded Cities Skylines uses over 30GB of memory. I would have considered an Optane drive for page file usage if they were more affordable.
While I do have a RX 570 4GB and a 1080p 60Hz monitor, I intend on upgrading the GPU when prices are more reasonable and getting a second monitor with a new GPU.
The problem is that Cities Skylines remains an absolute CPU hog, and I recently installed mods that extended the game engine's limits for an even larger city, so I expect memory and CPU usage will go up with that.
The other games that I play are Civilization 6 (also CPU bound, especially on huge maps full of AI civs and city states, and I think the mods are likely making it even more of a CPU hog with all of the extra features they added) and occasionally Total War Shogun 2 (single core and FPS falls below 30 in large battles on the Ryzen 1600).
The options I am looking at are:
i5 12400, preferably with a DDR4 motherboard bundle as I would need a new mobo.
i5 12600K with a DDR4 Z690 motherboard bundle deal that I've been looking at for $397: https://www.reddit.com/r/buildapcsales/comments/u9971w/cpuboard_i5_12600k_z690_gaming_x_ddr4_397_makes/
Ryzen 5700 (only if I know for certain if I'll be playing a game that scales to 6-7 cores so Windows 10's background services and other background applications don't impact the gaming)
Ryzen 5800X3D (if I get this, I would be riding the system until DDR6 had already launched...)
I don't plan on any major overclocking. If it's much cheaper for me to go with a non-K edition i5 and non-OCing motherboard, I'll strongly consider that as long as I can still do some RAM OCing as otherwise the motherboard might default to something like 2133/2400 MHz with the 48GB kit (which my B450 board will do on its own if I don't manually OC the RAM).
EDIT, it seems that non-K edition CPUs might have issues with RAM OCing, which makes me concerned as I absolutely do not want to be running my RAM at sub-2933 MHz:
Obviously, the game becomes harder to run with mods. But I've got some cash kicking around, and with these price drops I've been considering swapping out my RTX 2060 for either a 3060 or 3060 Ti. My CPU is a Ryzen 3700X, so I don't feel like the game is TOO much for that, even with mods. (The game runs fine without mods, consistently 90 FPS. With mods, there is noticeable lag; I feel like the FPS range is somewhere between 50-60.) My question is mainly: would upgrading my GPU actually improve my performance as much as I think it would?