Configuring single stream vs. multiple streams
Most S3-compatible applications let you tune how data is uploaded to an S3 endpoint. One important configuration option is the choice between a single stream and multiple streams for uploading data.
Multiple streams are very useful when you want higher throughput. While they can help you push data faster to Nutanix Objects, there are a few things to consider when you fine-tune this parameter:
Is a single data stream a bad idea?
Absolutely not. Many applications can live happily with a single connection to the Nutanix Objects endpoint, depending on how much data they push and how quickly they need to push it.
A single stream is a good fit when:
- Your application/program is not pushing large amounts of data to S3.
- A higher recovery time objective (RTO) is acceptable.
- You have limited network bandwidth.
- You don’t have a specific requirement for how long the upload takes. For example, your application generates terabytes of data a day, but you are OK if it takes a few hours to a day to push that data to the Nutanix Objects cluster.
- You don’t have specific requirements for reading data at a higher speed.
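To check whether a single stream fits your upload window, a quick back-of-the-envelope calculation helps. The sketch below is only an illustration: the 1 TB daily volume and the 100 MB/s sustained single-stream throughput are assumed figures, not measurements from Nutanix Objects, and `upload_hours` is a hypothetical helper name.

```python
def upload_hours(data_bytes: float, throughput_bytes_per_s: float) -> float:
    """Return the hours needed to move data_bytes at a steady throughput."""
    if throughput_bytes_per_s <= 0:
        raise ValueError("throughput must be positive")
    return data_bytes / throughput_bytes_per_s / 3600


TB = 10**12  # decimal terabyte
MB = 10**6   # decimal megabyte

# Assumed figures for illustration only: 1 TB generated per day,
# and a single stream sustaining 100 MB/s.
hours = upload_hours(1 * TB, 100 * MB)
print(f"{hours:.2f} hours")  # prints "2.78 hours"
```

If the result fits comfortably inside the time you are willing to wait, a single stream is probably enough. Replace the assumed throughput with a number you have actually measured in your environment.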
When should you consider multiple streams from one application?
- The data ingest rate is high.
- The application needs higher throughput.
- The application needs a lower RTO.
- The application generates data at a high, sustained rate and needs to push that data to Nutanix Objects.
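The idea behind multiple streams can be sketched with Python's standard `concurrent.futures` module: split the data into parts and hand the parts to a pool of workers. This is a minimal sketch, not how any particular application or SDK implements it. `upload_part` is a hypothetical placeholder that only reports the bytes it received; a real client would send each chunk to the Objects endpoint there. The part size and stream count are arbitrary example values.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_parts(data: bytes, part_size: int) -> list[bytes]:
    """Cut data into consecutive chunks of at most part_size bytes."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    return [data[i:i + part_size] for i in range(0, len(data), part_size)]


def upload_part(part_number: int, chunk: bytes) -> tuple[int, int]:
    """Hypothetical placeholder: a real client would send the chunk to the
    Objects endpoint here. This stub only reports how many bytes it got."""
    return part_number, len(chunk)


def parallel_upload(data: bytes, part_size: int, streams: int) -> list[tuple[int, int]]:
    """Upload all parts using up to `streams` concurrent workers."""
    parts = split_into_parts(data, part_size)
    with ThreadPoolExecutor(max_workers=streams) as pool:
        # map() preserves input order, so results are ordered by part number.
        return list(pool.map(upload_part, range(1, len(parts) + 1), parts))


results = parallel_upload(b"x" * 1000, part_size=256, streams=4)
print(results)  # [(1, 256), (2, 256), (3, 256), (4, 232)]
```

Setting `streams=1` gives the single-stream behavior with the same code path, which makes it easy to compare the two modes.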
How many parallel threads should I have?
There is no straight answer to this. Several factors play a role:
- Network setup. If you don’t have a high-speed network between your client and Nutanix Objects, adding more streams may slow things down because they can saturate the physical network.
- Other applications configured with the same Objects endpoint, each actively pushing data to it.
- Number of load balancers deployed.
- Number of nodes in the cluster.
- Whether the client can handle multiple streams efficiently.
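Because so many factors interact, one practical approach is to measure throughput at a few stream counts in your own environment and pick the smallest count that gets close to the best result. The sketch below assumes you already have such measurements. The function name `pick_stream_count` and the 5% tolerance are illustrative choices, not a Nutanix recommendation.

```python
def pick_stream_count(throughput_by_streams: dict[int, float],
                      tolerance: float = 0.05) -> int:
    """Return the smallest stream count whose measured throughput is within
    `tolerance` (a fraction, e.g. 0.05 for 5%) of the best measurement."""
    if not throughput_by_streams:
        raise ValueError("need at least one measurement")
    best = max(throughput_by_streams.values())
    good_enough = [
        streams
        for streams, throughput in throughput_by_streams.items()
        if throughput >= best * (1 - tolerance)
    ]
    return min(good_enough)
```

Preferring the smallest count that is "good enough" leaves headroom for other applications sharing the same endpoint and avoids adding streams once throughput stops improving or starts to drop.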