Forum Discussion

SamiSan
Explorer | Level 3
9 years ago

delays between chunks using cURL

 I'm trying to build a server script using Bash and cURL for upload/download/archive operations through the HTTP API.

 

However, after(!) every append (v2) there is a delay of 2-5 seconds before cURL can exit (that is, before it receives the response to the POST). The bigger the chunk, the longer it takes cURL/POST to exit successfully. I've tried sizes from 1 to 150MiB, but the delay is always there, increasing with chunk size. This also happens when uploading single files (< 150MB).

 

This is unacceptable for production use, since these delays between append uploads can add up to minutes or hours depending on the file size. Multithreading is obviously not possible, since each append POST depends on the previous one having finished, so everything has to be done sequentially.

 

What I've tried until now:

  • different versions of libcurl
  • different locations (in case this was an internet issue)
  • two different linux distributions (centos/debian)
  • modifying chunk sizes

 

Here is the code for an HTTP append operation, used between initiating the session and closing it:

 

curl -k -X POST -L --show-error --globoff -i --header "Authorization: Bearer $TOKEN" --header "Dropbox-API-Arg: {\"cursor\": {\"session_id\": \"$SESSION\",\"offset\": $RANGE},\"close\": false}" --header "Content-Type: application/octet-stream" --data-binary @"$CHUNK" "$APPENDURLV2" 2> /dev/null
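
For readers following along, here is a minimal sketch of how the `$RANGE` offset in the command above would advance from one append call to the next. It only prints a plan and makes no network calls. The function names `chunk_plan` and `append_arg` are made up for illustration, and the example sizes are arbitrary.

```shell
#!/usr/bin/env bash
# Sketch only: print the offset and length of each sequential chunk.
# chunk_plan and append_arg are hypothetical helper names, not part of any API.

# chunk_plan FILE_SIZE CHUNK_SIZE  (both in bytes)
# Prints one "offset length" pair per line.
chunk_plan() {
  local size=$1 chunk=$2 offset=0 len
  while [ "$offset" -lt "$size" ]; do
    len=$(( size - offset ))
    if [ "$len" -gt "$chunk" ]; then
      len=$chunk
    fi
    echo "$offset $len"
    offset=$(( offset + len ))
  done
}

# append_arg SESSION_ID OFFSET
# Builds the Dropbox-API-Arg header value in the same shape as the command above.
append_arg() {
  printf '{"cursor": {"session_id": "%s","offset": %s},"close": false}\n' "$1" "$2"
}

# Example: a 100-byte file in 40-byte chunks needs three sequential requests.
chunk_plan 100 40
append_arg "EXAMPLE_SESSION" 40
```

Each offset depends on the lengths of all chunks sent before it, which is why the requests in one session have to run one after another.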

 

 

 

 

 

 

  • Greg-DB
    Dropbox Staff
    Thanks for writing this up. For reference, is the delay you're talking about between when curl finishes uploading the data and when the server sends a response, or is it a delay of curl shutting down after receiving the response? In either case, how are you checking/measuring it?

    If it's the latter, I'm unfortunately not familiar enough with the internals of curl to say why it might have a delay like that. (Perhaps someone else on the forum is? Please feel free to chime in if so.)
    • SamiSan
      Explorer | Level 3

      Just to clarify once again: the delay happens after the actual transmission of the data, while waiting for the HTTP response (HTTP/1.1 200 OK etc.). I measure using GNU 'time' alongside other Linux-specific tools and/or on our hardware firewall. Additionally, I use a packet filter to determine when the "HTTP/1.1 200 OK" response comes back from Dropbox, and when it does, cURL exits. So the delay comes from waiting for this response.

      In both simple file uploads and chunked uploads, the bigger the file/chunk, the longer the delay seems to be. In the meantime I uploaded a file using Python for testing, and the delay seems to be present there too.

       

      I also asked a friend to check, and he can confirm this (he uses the Java API), although he hadn't paid attention to it before. Maybe other people experience the same thing but don't notice the delay, because they don't monitor traffic while their scripts run and so don't compare the runtime of a script with the actual data transmission time? At least this is my guess.

       

      Here are a few metrics for uploading the same file to Dropbox with different chunk sizes (I've throttled the upload to a constant ~50Mbit/s for testing purposes):

       

      File Size    Chunk Size    Total Transmission Time
      100MB        5MB           2:03 minutes
      100MB        20MB          55 seconds
      100MB        40MB          51 seconds
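
A rough back-of-the-envelope reading of these numbers, as a sketch: assuming 1 MB = 8 Mbit and that the throttle held at exactly 50 Mbit/s, sending 100 MB should take about 16 seconds on the wire, so everything beyond that is time spent not sending data. The helper name `overhead_per_chunk` is made up, and the estimate lumps session start/finish and connection setup into the per-request figure.

```shell
#!/usr/bin/env bash
# overhead_per_chunk FILE_MB CHUNK_MB TOTAL_SECONDS RATE_MBIT
# Hypothetical helper. Prints the number of requests, the ideal wire time,
# and the average non-transfer time per request.
# Assumes whole-number sizes (the ceiling division below relies on that).
overhead_per_chunk() {
  awk -v size="$1" -v chunk="$2" -v total="$3" -v rate="$4" 'BEGIN {
    n = int((size + chunk - 1) / chunk)   # requests needed (ceiling division)
    wire = size * 8 / rate                # seconds spent purely sending data
    printf "%d requests, %.0f s on the wire, %.2f s extra per request\n", n, wire, (total - wire) / n
  }'
}

overhead_per_chunk 100 5 123 50    # 2:03 minutes = 123 s
overhead_per_chunk 100 20 55 50
overhead_per_chunk 100 40 51 50
```

Under those assumptions the extra time per request comes out at roughly 5 s, 8 s and 12 s for 5MB, 20MB and 40MB chunks, so it grows with chunk size even though the total shrinks.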

       

      But since the maximum possible chunk size is 150MB and there is a considerable delay after each chunk, the accumulated delay during big file uploads is a HUGE hit. Here is another example that includes other providers offering an HTTP API, again using cURL and a fixed upload speed of ~50Mbit/s:

       

      Destination           File Size    Chunk Size    Total Transmission Time
      Dropbox               1000M        150M          9:29 minutes
      Cloud Provider "X"    1000M        5M            3:55 minutes
      Cloud Provider "Y"    1000M        222M          3:49 minutes
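
To put the three totals on one scale, the sketch below converts each into an effective throughput. It assumes "1000M" means 1000 MB with 1 MB = 8 Mbit, and the helper name `effective_mbit` is made up for illustration.

```shell
#!/usr/bin/env bash
# effective_mbit FILE_MB TOTAL_SECONDS
# Hypothetical helper: average throughput over the whole upload, delays included.
effective_mbit() {
  awk -v size="$1" -v total="$2" 'BEGIN { printf "%.1f Mbit/s\n", size * 8 / total }'
}

effective_mbit 1000 569   # Dropbox, 9:29 minutes
effective_mbit 1000 235   # Cloud Provider "X", 3:55 minutes
effective_mbit 1000 229   # Cloud Provider "Y", 3:49 minutes
```

With these inputs the Dropbox upload averages about 14 Mbit/s against about 34-35 Mbit/s for the other two, all on the same ~50Mbit/s link.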

       

      So while the transmission speed is the same when uploading to the different providers, the delays between chunks when uploading to Dropbox more than double the time needed to finish the whole file.

      Needless to say, this HUGE time overhead is not acceptable. The only explanation I can think of is that Dropbox calculates a hash of the data server side before returning HTTP 200 OK, which would also explain the longer delays for bigger chunk sizes.

      • Greg-DB
        Dropbox Staff
        Thanks! That's helpful. We'll look into it.
