Let's dig a bit deeper:
It's a good idea to start with a single stream and verify how much data can be pushed, then increase the number of streams one step at a time. You should see throughput increase roughly linearly as you add streams. At some point the throughput will plateau, with the same or only marginally better results as you add more streams. That is when you should start looking at other aspects to find the bottleneck.
To keep the calculation simple, let's say that from one client you are able to push X MBps with, say, 20 streams. If your application is not able to push data any faster, here is what to check.
Application Settings
Check how many streams your application opens to the Nutanix Objects endpoint.
- E.g. if you are using the AWS Python SDK, it creates a connection pool with just 10 connections by default. If your application uses that default, then even if you spawn more threads to upload data in parallel, they will all funnel through just 10 streams. That means you will have at most 10 concurrent connections to your Objects endpoint, and adding more threads will have minimal to no impact.
- Easy fix: increase the connection pool size and keep it proportional to the number of threads you are using.
- If you increase both the thread count and the pool size, you may hit a bottleneck on client resources. E.g. if your application runs on a single processor, you may end up spending more time computing the md5sum of the data than actually pushing it to Nutanix Objects.
- As part of a PUT request, the S3 protocol also carries an md5sum of the data you upload. Computing md5sum needs CPU cycles; if you open more streams and upload more objects/files to your Nutanix Objects cluster, the client may spend a significant amount of time computing checksums before the data upload even starts. Heavy CPU load can slow things down drastically.
- Compute the md5sum of the data before your application is ready to push it to the Objects cluster. As stated above, many concurrent md5sum computations can hog CPU resources and slow everything down, so it may be a good idea to compute checksums upfront and only then start making requests to the Nutanix Objects cluster.
- Easy fix: make sure you give enough resources to your client.
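A sketch of the precompute-first approach, assuming in-memory payloads; the object keys and bucket name are hypothetical. S3's `Content-MD5` header expects the base64-encoded MD5 digest:

```python
# Sketch: compute the Content-MD5 header value for every payload up front,
# before the upload threads start, so the upload loop only does I/O.
import base64
import hashlib

def content_md5(data: bytes) -> str:
    """Base64-encoded MD5 digest, as the S3 Content-MD5 header expects."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")

# Phase 1: CPU-bound checksum work, done before any connections open.
payloads = {"obj-1": b"some payload", "obj-2": b"another payload"}
checksums = {key: content_md5(data) for key, data in payloads.items()}

# Phase 2: the upload loop is then pure I/O, e.g. with boto3:
#   s3.put_object(Bucket="my-bucket", Key=key, Body=data,
#                 ContentMD5=checksums[key])
```

Separating the CPU-bound phase from the I/O-bound phase keeps the upload streams busy instead of stalling on checksum computation.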
- Most SDKs perform parameter validation before sending a request to the Objects endpoint. In my experiments this was not a concern, but you may want to try disabling it when you create the connection, so the SDK doesn't spend time validating parameters.
- Sending requests over HTTP vs. HTTPS: HTTPS has more CPU overhead, which may impact overall throughput. It may not be possible to turn HTTPS off in production, but during your experiments you may want to run a few tests with both. Based on your requirements, you can use either HTTP or HTTPS.
Using multiple clients vs single client
- It might be a good idea to use multiple clients to push data to the Objects cluster rather than one client with multiple threads.
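One way to approximate multiple clients on a single machine is separate processes, each with its own S3 client, so they don't contend on one connection pool or the GIL. This is only a sketch under that assumption; the bucket, endpoint, and file list are placeholders, and the actual upload call is left commented out:

```python
# Sketch: shard the upload work across processes, each acting as an
# independent client with its own connections.
import multiprocessing as mp

def shard(items, n):
    """Split the work list into n roughly equal shards, one per process."""
    return [items[i::n] for i in range(n)]

def worker(keys):
    # Each process builds its own client, so connections are not shared:
    #   import boto3
    #   s3 = boto3.client("s3", endpoint_url="https://objects.example.com")
    for key in keys:
        pass  # s3.upload_file(key, "my-bucket", key)

if __name__ == "__main__":
    files = [f"file-{i}" for i in range(100)]  # placeholder work list
    with mp.Pool(processes=4) as pool:
        pool.map(worker, shard(files, 4))
```

Scaling further usually means running the same loader on several machines, which also removes the single client's NIC and CPU as the ceiling.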
How many applications are configured against the same Nutanix Objects cluster
- If more than one application is pushing data to a single cluster, you may want to check the settings of the other applications as well.
- You may not be able to achieve peak throughput from every application, since they share the same Nutanix Objects cluster; the throughput each one gets depends on the combined workload.
Standard upload vs multipart upload
- In a single upload you can push a maximum of 5 GB of data, whether it is a standard upload or one part of a multipart upload.
- But if you want to upload a single object over multiple parallel streams, consider moving your application to multipart upload, so you can push multiple data chunks in parallel and achieve better throughput.
- E.g.: say the object size is 4 GB. You can either upload the entire object over one stream/connection, which is a standard PUT, or divide the 4 GB into multiple parts (say 100 MB each, i.e. 4 GB / 100 MB = 41 parts) and upload those parts over multiple streams. At the end, you finalize all the parts, which together represent the one 4 GB object.