service/s3/s3manager: Move part buffer pool upwards to allow reuse. #2863

Merged: 1 commit, Oct 1, 2019

skmcgrail
Member

Fixes #2036 by allowing part buffer pools to be reused and shared across upload requests that use the same `PartSize`.
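
For readers unfamiliar with the idea, here is a minimal stand-alone sketch of sharing buffer pools between uploads that use the same part size. It is not the SDK's implementation: `partPools` and `poolFor` are hypothetical names, and the only assumption taken from the description above is that pools are matched by part size.

```go
package main

import (
	"fmt"
	"sync"
)

// partPools maps a part size in bytes to the pool that serves buffers of
// exactly that size. Hypothetical name; not part of the SDK.
var partPools sync.Map // effectively map[int64]*sync.Pool

// poolFor returns the shared pool for partSize, creating it on first use.
func poolFor(partSize int64) *sync.Pool {
	if p, ok := partPools.Load(partSize); ok {
		return p.(*sync.Pool)
	}
	fresh := &sync.Pool{
		New: func() interface{} {
			buf := make([]byte, partSize)
			return &buf
		},
	}
	// LoadOrStore keeps whichever pool was stored first, so concurrent
	// callers always agree on a single pool per part size.
	p, _ := partPools.LoadOrStore(partSize, fresh)
	return p.(*sync.Pool)
}

func main() {
	const partSize = 5 * 1024 * 1024

	// Two independent "uploads" with the same part size draw from one pool.
	first := poolFor(partSize)
	second := poolFor(partSize)
	fmt.Println("same pool:", first == second)

	buf := first.Get().(*[]byte)
	fmt.Println("buffer length:", len(*buf))
	first.Put(buf) // return the buffer so a later part can reuse it
}
```

Note that `sync.Pool` may discard idle buffers at any time, so a design like this reduces allocations without guaranteeing a fixed number of them.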

Contributor

jasdel left a comment

Looks good, just needs a changelog entry.

skmcgrail merged commit fe72a52 into aws:master Oct 1, 2019
skmcgrail deleted the s3manager/partPool branch October 1, 2019 21:58
skmcgrail added a commit to skmcgrail/aws-sdk-go-v2 that referenced this pull request Oct 1, 2019
### Services
* Synced the V2 SDK with latest AWS service API definitions.

### SDK Breaking changes
* This update includes breaking changes to how the DynamoDB AttributeValue (un)marshaler handles empty collections.

### Deprecations
* `service/s3/s3crypto`: Deprecates the crypto client from the SDK ([aws#394](aws#394))
  * The s3crypto client is now deprecated and may be removed from future versions of the SDK.
* `aws`: Removes plugin credential provider ([aws#391](aws#391))
  * Removing plugin credential provider from the v2 SDK developer preview. This feature may be made available as a separate module.
* Removes support for deprecated Go versions ([aws#393](aws#393))
  * Removes support for Go version specific files from the SDK. Also removes irrelevant build tags, and updates the README.md file.
  * Raises the minimum supported version to Go 1.11 for the SDK. Older versions may work, but are not actively supported.

### SDK Features
* `service/s3/s3manager`: Add Upload Buffer Provider ([aws#404](aws#404))
  * Adds a new `BufferProvider` member for specifying how part data can be buffered in memory.
  * Windows platforms will now default to buffering 1MB per part to reduce contention when uploading files.
  * Non-Windows platforms will continue to employ a non-buffering behavior.
* `service/s3/s3manager`: Add Download Buffer Provider ([aws#404](aws#404))
  * Adds a new `BufferProvider` member for specifying how part data can be buffered in memory when copying from the HTTP response body.
  * Windows platforms will now default to buffering 1MB per part to reduce contention when downloading files.
  * Non-Windows platforms will continue to employ a non-buffering behavior.
* `service/dynamodb/dynamodbattribute`: New Encoder and Decoder Behavior for Empty Collections ([aws#401](aws#401))
  * The `Encoder` and `Decoder` types have been enhanced to support the marshaling of empty structures, maps, and slices to and from their respective DynamoDB AttributeValues.
  * This change incorporates the behavior changes introduced via a marshal option in V1 ([#2834](aws/aws-sdk-go#2834))
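
The empty-collections entry above hinges on Go's distinction between a nil collection and an empty one. The toy model below illustrates that distinction only. `describe` is a hypothetical helper, not the SDK's `Encoder`, and the mapping of nil values to NULL is an assumption, since the changelog does not spell out how nil values are treated.

```go
package main

import (
	"fmt"
	"reflect"
)

// describe reports how a toy encoder that preserves empty collections might
// classify a Go value. Hypothetical model only; not the SDK's Encoder.
func describe(v interface{}) string {
	rv := reflect.ValueOf(v)
	switch rv.Kind() {
	case reflect.Slice:
		if rv.IsNil() {
			return "NULL" // assumption: nil slices are not preserved
		}
		return fmt.Sprintf("L(len=%d)", rv.Len())
	case reflect.Map:
		if rv.IsNil() {
			return "NULL" // assumption: nil maps are not preserved
		}
		return fmt.Sprintf("M(len=%d)", rv.Len())
	default:
		return "unsupported"
	}
}

func main() {
	var nilList []string
	emptyList := []string{}
	var nilMap map[string]int
	emptyMap := map[string]int{}

	fmt.Println("nil slice:  ", describe(nilList))
	fmt.Println("empty slice:", describe(emptyList))
	fmt.Println("nil map:    ", describe(nilMap))
	fmt.Println("empty map:  ", describe(emptyMap))
}
```

Code that previously relied on empty slices or maps being stored as NULL is where a breaking change of this kind would be felt.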

### SDK Enhancements
* `internal/awsutil`: Suppress logging of sensitive API parameters ([aws#398](aws#398))
  * Suppresses logging of API parameters marked with the `sensitive` trait. This prevents an API type's `String` method from returning a representation that prints sensitive fields such as keys and passwords.
  * Related to [aws/aws-sdk-go#2310](aws/aws-sdk-go#2310)
  * Fixes [aws#251](aws#251)
* `aws/request`: `Retryer` is now a named field on `Request`. ([aws#393](aws#393))
* `service/s3/s3manager`: Adds `sync.Pool` to allow reuse of part buffers for streaming payloads ([aws#404](aws#404))
  * Fixes [aws#402](aws#402)
  * Uses the new behavior introduced in V1 [#2863](aws/aws-sdk-go#2863), which allows the `sync.Pool` to be reused across multiple Upload requests with matching part sizes.

### SDK Bugs
* `service/s3/s3manager`: Fix index out of range when a streaming reader returns -1 ([aws#378](aws#378))
  * Fixes the S3 Upload Manager's handling of an unbounded streaming reader that reports a negative number of bytes read.
* `internal/ini`: Fix ini parser to handle empty values ([aws#406](aws#406))
  * Fixes incorrect modifications to the previous token value of the skipper. Adds checks for cases where a skipped statement should be marked as complete and not be ignored.
  * Adds tests for nested and empty field value parsing, along with tests suggested in [aws/aws-sdk-go#2801](aws/aws-sdk-go#2801)
skmcgrail added a commit to aws/aws-sdk-go-v2 that referenced this pull request Oct 2, 2019
aws-sdk-go-automation mentioned this pull request Oct 2, 2019
robin865 commented Nov 7, 2019

I have a question. From reading this change, it looks like part uploads within the same request now share a common buffer pool (`partPool`). So if I do an upload with ten 20 MB parts, each of those parts will reuse the same buffer pool.

However, this doesn't seem to ensure that consecutive uses of the same uploader share anything. If I do 10 uploads in a row, and each upload sends ten 20 MB parts, I will still allocate ten 20 MB buffers (one per upload), whereas before I would have allocated 10 × 10 = 100 buffers of 20 MB each (one per part).

Is this an accurate description? If so, I don't think #2036 is really fixed, since that issue was about sharing across requests. What I think should happen in my example above is that we create a single 20 MB buffer that is used by every part upload in all 10 requests.

It seems like the `BufferProvider` option is closer to what I was getting at. However, it seems to require a seekable body, in which case it has no effect when I use the upload manager for a multipart upload, since that takes an `io.Reader` (not an `io.ReadSeeker`).

Can you clarify whether what I'm saying is accurate? Is there a way to have a shared buffer used by all part uploads across all requests?
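
The buffer counts in this comment can be reproduced with a small stand-alone simulation. This is a sketch under stated assumptions, not SDK code: `countingPool` and `uploadAll` are hypothetical, parts are processed one at a time, and the pool never discards buffers (unlike `sync.Pool`), so the counts are deterministic.

```go
package main

import (
	"fmt"
	"sync"
)

// countingPool is a hypothetical free-list pool that records how many
// buffers it has allocated. Unlike sync.Pool it never drops idle buffers.
type countingPool struct {
	mu        sync.Mutex
	free      [][]byte
	size      int
	allocated int
}

// Get returns an idle buffer if one exists, otherwise allocates a new one.
func (p *countingPool) Get() []byte {
	p.mu.Lock()
	defer p.mu.Unlock()
	if n := len(p.free); n > 0 {
		buf := p.free[n-1]
		p.free = p.free[:n-1]
		return buf
	}
	p.allocated++
	return make([]byte, p.size)
}

// Put returns a buffer to the pool for later reuse.
func (p *countingPool) Put(buf []byte) {
	p.mu.Lock()
	p.free = append(p.free, buf)
	p.mu.Unlock()
}

// uploadAll simulates sequential uploads and returns the total number of
// buffers allocated. With shared=false every upload gets its own pool.
func uploadAll(uploads, parts, partSize int, shared bool) int {
	total := 0
	sharedPool := &countingPool{size: partSize}
	for u := 0; u < uploads; u++ {
		pool := sharedPool
		if !shared {
			pool = &countingPool{size: partSize}
		}
		for i := 0; i < parts; i++ {
			buf := pool.Get()
			// A real uploader would fill buf and send the part here.
			pool.Put(buf)
		}
		if !shared {
			total += pool.allocated
		}
	}
	return total + sharedPool.allocated
}

func main() {
	const partSize = 20 << 20 // 20 MB, as in the example above

	fmt.Println("per-upload pools:", uploadAll(10, 10, partSize, false)) // prints per-upload pools: 10
	fmt.Println("shared pool:     ", uploadAll(10, 10, partSize, true))  // prints shared pool:      1
}
```

With concurrent part uploads the shared-pool count would rise to roughly the number of parts in flight at once, rather than staying at one.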

Successfully merging this pull request may close these issues.

Upload manager does not re-use buffer pools across requests