Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KAFKA-6424: QueryableStateIntegrationTest#queryOnRebalance should accept raw text #4549

Merged
merged 7 commits into from
Feb 16, 2018

Conversation

h314to
Copy link
Contributor

@h314to h314to commented Feb 8, 2018

  • Remove to .toLowerCase in word count stream
  • Add method to read text from file, and move text to resources file
  • Minor cleanup: change argument order in assertEquals

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

…ept raw text

* Remove to .toLowerCase in word count stream
* Add method to read text from file, and move text to resources file
* Minor cleanup: change argument order in assertEquals
@mjsax
Copy link
Member

mjsax commented Feb 8, 2018

Build error is related:

Unknown license: /home/jenkins/jenkins-slave/workspace/kafka-pr-jdk9-scala2.12/streams/src/test/resources/data/sampleText.txt

We cannot just a file without license header. Please move back to in-memory. Thanks.

@h314to
Copy link
Contributor Author

h314to commented Feb 8, 2018

Thanks. I missed that on the output. I'm moving it back to in-memory.

@mjsax
Copy link
Member

mjsax commented Feb 8, 2018

Call for review @tedyu. I am not sure if this PR addresses the JIRA. As you create the JIRA, you should be able to clarify. Thanks.

@tedyu
Copy link
Contributor

tedyu commented Feb 8, 2018

My intention was to allow the test to read from an external file.
File has bigger capacity which can test related code at higher scale.

@mjsax
Copy link
Member

mjsax commented Feb 8, 2018

@tedyu Would the file path/name be hardcoded? Or how to pass in the file name? We need the in-memory fall-back as we cannot just add a file without license header.

An alternative would be, to create the file at startup and read it back during the test.

@tedyu
Copy link
Contributor

tedyu commented Feb 8, 2018

Does data file have to contain license header ?

Hardcoding file path is fine - if the file doesn't exist, we can fall back to the current text.

Thanks

@h314to
Copy link
Contributor Author

h314to commented Feb 8, 2018

If we want to have the text in a file and the file must have a license I could just skip the license while reading, or parse the license lines, adding them to the word count stream, and adjust the test to accommodate this change.

@tedyu
Copy link
Contributor

tedyu commented Feb 8, 2018

Either way is fine.
Probably skipping the header is less work for you.

Thanks

WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment here saying not to add more line(s) above this line ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this twice, it might be preferable to not add this file because the parsing logic skipping the header is non-intuitive. We could instead keep the in-memory data for regular test, but add a check if this file exist and if yes, use the file instead of in-memory data. WDYT?

WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this twice, it might be preferable to not add this file because the parsing logic skipping the header is non-intuitive. We could instead keep the in-memory data for regular test, but add a check if this file exist and if yes, use the file instead of in-memory data. WDYT?

"eat sleep rave repeat",
"one jolly sailor",
"king of the world");
inputValues = readInputValues("data/sampleText.txt", 15);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this, it might be better to keep in-memory as default for the case the file does not exist and read the file if it exist overwriting the in-memory data. This keep the test itself cleaner and avoid the header skipping logic that is quite unintuitive.

For large files, I am wondering, if reading the whole file into an array is good enough... What file sized did you have in mind @tedyu ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About 5x to 10x the size of what the array holds.

@h314to
Copy link
Contributor Author

h314to commented Feb 12, 2018

I've done a few things to try to address your comments. I removed the header skipping logic and now default to a list of strings that is enough to run the tests.

I documented the method to make what it is doing clear, and changed the default file name to resources/QueryableStateIntegrationTest/inputValues.txt which I think is a bit more obvious.

Copy link
Member

@mjsax mjsax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

\cc @tedyu Call for review.

input.add(line);
}
} catch (Exception e) {
input = Arrays.asList(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a log saying that internal strings are used since inputValues.txt is absent.

*/
private List<String> getInputValues() {
List<String> input = new ArrayList<>();
ClassLoader classLoader = getClass().getClassLoader();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some more nits. Can you add final wherever possible: ClassLoader, String filename, BufferedReader, for(final String...), Exception

private List<String> getInputValues() {
List<String> input = new ArrayList<>();
ClassLoader classLoader = getClass().getClassLoader();
String fileName = "QueryableStateIntegrationTest/inputValues.txt";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use File.separator instead of /

input.add(line);
}
} catch (Exception e) {
log.warn("Unable to read '{}'. Using default inputValues list", "resources/" + fileName);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as above: avoid /

Update to

log.warn("Unable to read '{}{}{}'. Using default inputValues list", "resources", File.seperator, fileName);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's certainly safer. Thanks. I also changed it in fileName and added final as requested.

Copy link
Member

@mjsax mjsax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update @h314to . Will merge after we get a green build.

@mjsax
Copy link
Member

mjsax commented Feb 15, 2018

Retest this please.

1 similar comment
@mjsax
Copy link
Member

mjsax commented Feb 15, 2018

Retest this please.

@mjsax mjsax merged commit 7303f41 into apache:trunk Feb 16, 2018
@mjsax
Copy link
Member

mjsax commented Feb 16, 2018

Merged to trunk

Thanks for the PR @h314to !

@h314to
Copy link
Contributor Author

h314to commented Feb 16, 2018

Great! Thanks for the helpful reviews.

@h314to h314to deleted the fix/KAFKA-6424 branch February 17, 2018 15:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants