Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KAFKA-6424: QueryableStateIntegrationTest#queryOnRebalance should accept raw text #4549

Merged
merged 7 commits into from
Feb 16, 2018
Original file line number Diff line number Diff line change
Expand Up @@ -66,7 +66,10 @@
import org.junit.Test;
import org.junit.experimental.categories.Category;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
Expand Down Expand Up @@ -132,6 +135,21 @@ private void createTopics() throws InterruptedException {
CLUSTER.createTopics(outputTopic, outputTopicConcurrent, outputTopicConcurrentWindowed, outputTopicThree);
}

private List<String> readInputValues(String resourceFileName, int headerLinesToSkip) throws Exception {
List<String> input = new ArrayList<>();
ClassLoader classLoader = getClass().getClassLoader();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some more nits. Can you add final wherever possible: ClassLoader, String filename, BufferedReader, for(final String...), Exception

try (BufferedReader reader = new BufferedReader(new FileReader(classLoader.getResource(resourceFileName).getFile()))) {
int linesSkipped = 0;
for (String line = reader.readLine(); line != null; line = reader.readLine()) {
if (linesSkipped < headerLinesToSkip) {
linesSkipped += 1;
} else {
input.add(line);
}
}
}
return input;
}

@Before
public void before() throws Exception {
Expand Down Expand Up @@ -167,20 +185,7 @@ public int compare(final KeyValue<String, Long> o1,
return o1.key.compareTo(o2.key);
}
};
inputValues = Arrays.asList(
"hello world",
"all streams lead to kafka",
"streams",
"kafka streams",
"the cat in the hat",
"green eggs and ham",
"that Sam i am",
"up the creek without a paddle",
"run forest run",
"a tank full of gas",
"eat sleep rave repeat",
"one jolly sailor",
"king of the world");
inputValues = readInputValues("data/sampleText.txt", 15);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this, it might be better to keep in-memory as default for the case the file does not exist and read the file if it exist overwriting the in-memory data. This keep the test itself cleaner and avoid the header skipping logic that is quite unintuitive.

For large files, I am wondering, if reading the whole file into an array is good enough... What file sized did you have in mind @tedyu ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About 5x to 10x the size of what the array holds.

inputValuesKeys = new HashSet<>();
for (final String sentence : inputValues) {
final String[] words = sentence.split("\\W+");
Expand Down
28 changes: 28 additions & 0 deletions streams/src/test/resources/data/sampleText.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment here saying not to add more line(s) above this line ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this twice, it might be preferable to not add this file because the parsing logic skipping the header is non-intuitive. We could instead keep the in-memory data for regular test, but add a check if this file exist and if yes, use the file instead of in-memory data. WDYT?

hello world
all streams lead to kafka
streams
kafka streams
the cat in the hat
green eggs and ham
that Sam i am
up the creek without a paddle
run forest run
a tank full of gas
eat sleep rave repeat
one jolly sailor
king of the world