Provide request body in request info. #811

Merged
merged 8 commits into from
Oct 23, 2018

ArturGaspar
Contributor

Fixes #688

Besides the issues mentioned in the added documentation (non-standard HAR, request body not available in certain Lua functions), enabling this functionality breaks the HAR viewer, which always expects postData of type "application/x-www-form-urlencoded" to have a "params" field.
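
For illustration, the viewer's expectation could be met along these lines. This is a minimal sketch and not code from this PR: `make_post_data` is a hypothetical helper, and it assumes the HAR 1.2 `postData` layout, where `params` is a list of name/value objects used for form posts and `text` carries any other body.

```python
from urllib.parse import parse_qsl

FORM_MIME = "application/x-www-form-urlencoded"


def make_post_data(body: bytes, mime_type: str) -> dict:
    """Build a HAR-style postData dict (hypothetical helper, not Splash code)."""
    text = body.decode("utf-8", errors="replace")
    # Ignore parameters such as "; charset=UTF-8" when comparing MIME types.
    base_type = mime_type.split(";")[0].strip().lower()
    if base_type == FORM_MIME:
        # Form posts get the "params" list that the viewer looks for.
        params = [
            {"name": name, "value": value}
            for name, value in parse_qsl(text, keep_blank_values=True)
        ]
        return {"mimeType": mime_type, "params": params}
    # Any other body is passed through as plain text.
    return {"mimeType": mime_type, "text": text}
```

A producer that only ever emits `text` would hit the viewer problem described above for form posts, which is why the sketch branches on the MIME type.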

@ArturGaspar
Contributor Author

Oops, I broke some tests. I will reopen when I fix them.

@ArturGaspar ArturGaspar reopened this Sep 21, 2018

codecov bot commented Sep 21, 2018

Codecov Report

Merging #811 into master will decrease coverage by 0.01%.
The diff coverage is 89.85%.

@@            Coverage Diff             @@
##           master     #811      +/-   ##
==========================================
- Coverage    88.9%   88.89%   -0.02%     
==========================================
  Files          41       41              
  Lines        5670     5726      +56     
  Branches      781      791      +10     
==========================================
+ Hits         5041     5090      +49     
- Misses        459      462       +3     
- Partials      170      174       +4
Impacted Files                  Coverage Δ
splash/resources.py             87.86% <100%> (+0.08%) ⬆️
splash/request_middleware.py    79.38% <100%> (ø) ⬆️
splash/browser_tab.py           93.93% <100%> (+0.04%) ⬆️
splash/qtrender.py              93.38% <100%> (+0.09%) ⬆️
splash/har_builder.py           78.43% <100%> (ø) ⬆️
splash/qwebpage.py              88.23% <100%> (+0.11%) ⬆️
splash/har/qt.py                94.8% <100%> (+1.7%) ⬆️
splash/render_options.py        95.68% <100%> (+0.03%) ⬆️
splash/qtrender_lua.py          95.8% <100%> (+0.01%) ⬆️
splash/defaults.py              100% <100%> (ø) ⬆️
... and 1 more


Note that request data in :ref:`splash-request-info` is not available in the
callback :ref:`splash-on-response-headers` or in the request of the response
returned by :ref:`splash-http-get` and :ref:`splash-http-post`.
Member

Where is it available then? You mentioned HAR, is it available in on_request callbacks?

Contributor Author

Yes, and also in on_response (in response.request.info).

I was going to add a test for that now, but I hit what I believe to be a bug, which I reported as #814.

Contributor Author

Now it also has a test for on_response.

# WebCore::FormDataIODevice object, which is sequential. Its
# getFormDataSize() method cannot be accessed through PyQt5,
# but Qt WebKit puts its value in the Content-Length header
# (see WebCore::QNetworkReplyHandler::getIODevice).
Member

great observation 👍
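
The idea in the code comment above can be sketched without Qt. This is an illustration under stated assumptions, not the code in this PR: `FakeSequentialDevice` is a hypothetical stand-in for a sequential QIODevice, `expected_body_size` is a made-up helper name, and the headers are assumed to arrive as a plain dict with lower-case keys.

```python
from typing import Mapping, Optional


class FakeSequentialDevice:
    """Hypothetical stand-in for a sequential QIODevice (not a Qt class)."""

    def __init__(self, data: bytes, buffered: int = 0):
        self._data = data
        self._buffered = buffered  # bytes currently sitting in the buffer

    def isSequential(self) -> bool:
        return True

    def size(self) -> int:
        # Mimics the behaviour described above: only buffered bytes count.
        return self._buffered

    def peek(self, max_size: int) -> bytes:
        # Return data without consuming it.
        return self._data[:max_size]


def expected_body_size(device, headers: Mapping[str, str]) -> Optional[int]:
    """Pick a size to read: size() if reliable, else Content-Length."""
    if not device.isSequential():
        return device.size()
    value = headers.get("content-length")
    if value is None:
        return None  # size unknown, caller has to give up or guess
    try:
        return int(value)
    except ValueError:
        return None  # malformed header, treat as unknown


device = FakeSequentialDevice(b"x" * 100000)
size = expected_body_size(device, {"content-length": "100000"})
body = device.peek(size) if size is not None else None
```

With this stand-in, `device.size()` reports 0 before any read, while the header-derived size covers the whole body.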

@@ -123,6 +123,38 @@ def createRequest(self, operation, request, outgoingData=None):
self.log(traceback.format_exc(), min_level=1, format_msg=False)
return super(ProxiedQNetworkAccessManager, self).createRequest(operation, request, outgoingData)

    def _get_request_body(self, request, outgoing_data):
        if outgoing_data is not None:
Member

It is often easier to read functions when special cases are handled first:

if outgoing_data is None:
    return None
# ... the rest

This also frees up horizontal space, since the rest of the function is nested one level less.

            if outgoing_data.isSequential():
                # In a sequential QIODevice, size() returns the value of
                # bytesAvailable(), which is only the size of the data in the
                # QIODevice buffer and not the total size of the output. Until
Member

Do you know if tests trigger this behavior (data with a size larger than a buffer)? It seems new tests only use very small request bodies.

Contributor Author

Good point. I added a test for it.
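
As a rough analogy for why small request bodies do not exercise this path, the sketch below uses Python's standard `io.BufferedReader` rather than Qt. It is not the test added in this PR, and `first_peek` is a hypothetical helper: it only shows that a single peek on a buffered stream sees the whole body when the body fits in the buffer, and only a prefix when it does not.

```python
import io


def first_peek(body: bytes, buffer_size: int) -> bytes:
    """Return what one peek() sees on a freshly wrapped buffered stream."""
    reader = io.BufferedReader(io.BytesIO(body), buffer_size=buffer_size)
    return reader.peek()


small = b"foo=bar"   # fits in the buffer, so the peek sees everything
large = b"x" * 4096  # larger than the buffer, so the peek is truncated

small_seen = first_peek(small, buffer_size=64)
large_seen = first_peek(large, buffer_size=64)
```

A test that only posts something like `small` would pass even if the size were taken from the buffer, so a body larger than the buffer is needed to catch a regression.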

@kmike
Member

kmike commented Oct 23, 2018

Looks great, thanks @ArturGaspar!

@kmike kmike merged commit fe8c67d into scrapinghub:master Oct 23, 2018