Conversation

strombergdev (Contributor)

No description provided.

@jhodges10 changed the title from "improve downloader ram usage and make replace optional" to "Improve Downloader RAM usage and add optional replace file boolean" on Jul 7, 2020
self.retry_strategy = Retry(
total=3,
backoff_factor=1,
status_forcelist=[429],
@jhodges10 (Contributor), Jul 9, 2020

For this, I think you'll want to add more statuses so that it catches more than just a rate-limit error, because the expected failure here is unlikely to be a 429. There might also be other ways of ensuring that we retry a failed download.
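
A broadened policy, mounted on a `requests` session so every call picks it up, might look like this sketch (the status list here is illustrative, not this PR's final choice):

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry on rate limiting plus transient server-side failures.
retry_strategy = Retry(
    total=3,
    backoff_factor=1,
    status_forcelist=[429, 500, 502, 503, 504],
)

session = requests.Session()
# Mounting the adapter applies the retry policy to every HTTPS request
# made through this session.
session.mount("https://", HTTPAdapter(max_retries=retry_strategy))
```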

@strombergdev (Contributor, Author)

Ok! I might have been a little quick here. Testing some other error codes.

Do you think we should add an xxhash check here and retry the download if it fails? Then I can work on a solution to retry, say, 3 times if the hashes don't match.

@jhodges10 (Contributor)

That's not a bad idea! If you want to add it, feel free.

@jhodges10 (Contributor)

Doing the xxhash check on download is much easier than on upload: the delay on the upload side would require putting the QC into a queue, managing that state, and creating a background thread to check on it continuously.
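
A download-side verification loop along these lines could look like the following sketch. `hashlib.sha256` stands in for xxhash so the example stays dependency-free, and `download_file`, `verify_or_retry`, and `MAX_ATTEMPTS` are hypothetical names, not this PR's actual code:

```python
import hashlib
import os

MAX_ATTEMPTS = 3

def calculate_hash(path, chunk_size=1024 * 1024):
    """Hash the file in chunks so large downloads aren't loaded into RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_or_retry(download_file, destination, original_checksum):
    for attempt in range(MAX_ATTEMPTS):
        download_file(destination)
        if not original_checksum:
            # Older files without a stored checksum: skip verification.
            return destination
        if calculate_hash(destination) == original_checksum:
            return destination
        os.remove(destination)  # corrupt copy; try again
    raise IOError("Checksum mismatch after %d attempts" % MAX_ATTEMPTS)
```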

@strombergdev (Contributor, Author)

Check the latest commit 😉, and yes, uploads are a bit trickier!

self.http_retry_strategy = Retry(
total=3,
backoff_factor=1,
status_forcelist=[408, 500, 502, 503, 504],
@jhodges10 (Contributor)

Where did you get this list of error codes? For reference, S3 will only throw a couple of errors we should care about (almost all of which are 4xx not 5xx).

https://docs.aws.amazon.com/AmazonS3/latest/API/ErrorResponses.html#ErrorCodeList

@strombergdev (Contributor, Author)

My thinking was that 400, 403, 409, etc. are user-related issues, so they shouldn't be retried; instead we should retry temporary server-side errors. But maybe S3 never throws those error codes anyway, so that might be incorrect.

What do you suggest? I am at a loss here :)

continue

if not original_checksum:
break
@jhodges10 (Contributor)

If we break here, what does this error look like in the console/logs?

@strombergdev (Contributor, Author)

I added this for older files and edge cases without an FIO checksum. So it's not really an error; we just download the file and skip verification.

if not original_checksum:
break

if calculate_hash(final_destination) == original_checksum:
@jhodges10 (Contributor)

I'd like to return the output path as an absolute path if it's a successful download. I think that's a really useful thing!

@strombergdev (Contributor, Author)

Sure! Do you think we should do that for the case above as well, where the file isn't verified? My thought was that verification is handled under the hood, and we don't inform the calling function of whether it's done or not.
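
The suggested return could be as simple as normalizing the path on success; a minimal sketch (`resolve_output` is a hypothetical helper name, and `final_destination` mirrors the variable in the snippet above):

```python
import os

def resolve_output(final_destination):
    """Return the downloaded file's location as an absolute path, so callers
    get a consistent, usable path regardless of how the download was invoked."""
    return os.path.abspath(final_destination)
```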

@subsetpark subsetpark removed their request for review January 26, 2021 14:27
@jhodges10 (Contributor)

This is going to end up getting rolled into #73

@jhodges10 added the labels "duplicate" (This issue or pull request already exists) and "enhancement" (New feature or request) on Jun 10, 2021
@strombergdev closed this by deleting the head repository on Apr 26, 2024