A CPU core does one thing at a time. An operating system provides threads, which time-share the CPU so it can make progress on multiple tasks seemingly at once. A multi-threaded webserver is the conventional way to handle multiple web requests at the same time: a request comes in and the webserver has a thread waiting for it, so another request can be served without blocking the whole server on one task alone. Most server configurations expose a max-concurrent-threads type setting, because each thread costs RAM and attention from the OS. Most webservers are conceived of as a gateway: they host static HTML files and images themselves and pipe dynamic requests to a common interface running on the OS. I find nginx the best example of this when using it with PHP or Ruby and MySQL, but IIS is the same, running a DLL and SQL Server, with the application linked against the IIS library classes that manage the HTTP request and pass the data to the running application.
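To make the thread-per-request idea concrete, here is a minimal sketch using only Python's standard library (the handler, port, and one-second delay are illustrative, not taken from any particular server):

```python
# A minimal sketch of the thread-per-request model. ThreadingHTTPServer
# hands each incoming request to its own thread, so one slow request
# does not block the next one from being served.
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
import time

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        time.sleep(1)  # simulate slow work (disk, database, ...)
        body = b"hello\n"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    # Each concurrent request gets a thread; production servers cap
    # this with the "max concurrent threads" style setting mentioned above.
    ThreadingHTTPServer(("127.0.0.1", 8080), Handler).serve_forever()
```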
The webserver parses the HTTP header, doing what its code was compiled to do: confirming the request's validity and passing it as structured data to your code. The webserver itself is connected to the TCP stack, waiting on it, and the TCP stack does a similar job at the network interface, so there is a chain of events for every request. Under heavy load there is generally an increasing delay before new requests get processed: as the webserver's threads fill up, they have to clear before the next request can be taken from the TCP stack. The threads can process in parallel, and their number can be increased. The TCP stack, by contrast, is a serial interface: the TCP packets are time slices, arriving one by one in a continuous stream that it processes quickly into designated buffers, and it does this very well. The webserver gets its data from one of these buffers at the allotted time, depending on buffer sizes and so on, and it too receives the data serially, one piece at a time, into its own buffers while the threads complete their work. If those buffers fill up, it responds with a "server unavailable" message (HTTP 503) and will not accept a new request until a thread is free again. These tolerances can be fine-tuned, and multiple servers can be configured behind a load balancer that shares the requests between them.
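This hand-off from the TCP stack to the server is visible at the socket level. A sketch of the accept loop (the backlog value, buffer size, and canned response are illustrative):

```python
# A minimal sketch of the serial hand-off from the TCP stack to the
# webserver. listen(backlog) sets how many completed connections the
# OS will queue while our code is busy; the server drains that queue
# one accept() at a time.
import socket

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 8080))
srv.listen(16)  # backlog: connections the OS queues while we are busy

while True:
    conn, addr = srv.accept()  # take the next connection off the queue
    request = conn.recv(4096)  # serial read from the kernel buffer, one chunk at a time
    # ...parse the header, hand structured data to the application...
    conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
    conn.close()
```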
Completing tasks in parallel using threads is CPU intensive, wasting CPU time swapping between them and waiting on input/output. A computer program is a list of tasks; the code probably requires loops and many functions chained together to complete. Running two or more of these sequences in parallel finishes them at roughly the same time, but on a single core each one takes at least twice as long as it would alone, plus the context-switching overhead. Doing things in sequence instead of in parallel completes each task without that swapping overhead, but the subsequent tasks have to wait their turn.
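A rough way to observe this overhead (the numbers will vary by machine, and note that CPython's GIL serializes pure-Python work anyway, which makes the effect easy to see on any core count):

```python
# A rough sketch comparing sequential and threaded execution of the
# same CPU-bound work. On one core (and in CPython, under the GIL)
# the threaded version is no faster and pays context-switch overhead.
import threading, time

def work():
    total = 0
    for i in range(10_000_000):
        total += i

start = time.perf_counter()
work(); work()                      # one after the other
print("sequential:", time.perf_counter() - start)

start = time.perf_counter()
threads = [threading.Thread(target=work) for _ in range(2)]
for t in threads: t.start()
for t in threads: t.join()          # both "in parallel"
print("threaded:  ", time.perf_counter() - start)
```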
The OS has to manage a stack of activity for each request's thread, and the web application must, at its design root, adhere to that environment.
When asynchronous code is utilized, the web application's scaling increases dramatically, serving many more requests concurrently than a thread-based solution. The scaling is obtained by fully utilizing the OS and CPU resources. Things like disk access or requests to a SQL server create a delay while the thread waits for a response. Under load, while that thread is waiting, the buffers are filling up, but the thread cannot do anything until the wait is complete, effectively locking the application. Async code, done correctly, does not lock like this; it keeps scaling until the CPU resources are fully used.
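A minimal sketch of the async model with Python's asyncio (the one-second sleep stands in for a SQL query or disk access; the handler name and port are illustrative):

```python
# A minimal sketch of the async model: while one request awaits its
# simulated database call, the event loop serves the others, so a
# single thread handles many requests concurrently without blocking.
import asyncio

async def handle(reader, writer):
    await reader.read(4096)          # read the request
    await asyncio.sleep(1)           # stands in for a SQL query or disk I/O
    writer.write(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
    await writer.drain()
    writer.close()

async def main():
    server = await asyncio.start_server(handle, "127.0.0.1", 8080)
    async with server:
        await server.serve_forever()

asyncio.run(main())
```

The key difference from the threaded sketch earlier: every `await` is a point where the thread hands control back to the event loop instead of sitting idle, so the waiting itself costs nothing.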