This program merges multiple word lists into a single output file.
./merge [-o output_file] [-t num_threads] file1 file2 ... fileN
-o, --output FILE
: Specify output file name. Default is output.txt.
-t, --threads NUM
: Specify number of threads to use. Default is 4.
-h, --help, -?
: Display help message.
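As an illustration of how these options might be handled, here is a minimal sketch of a hand-written parser. The `Options` struct and `parse_args` function are hypothetical names introduced for this example, not taken from the program's source; only the flags and defaults come from the list above.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical container for the parsed command line.
struct Options {
    std::string output = "output.txt";  // documented default for -o
    unsigned threads = 4;               // documented default for -t
    bool help = false;                  // set by -h, --help, or -?
    std::vector<std::string> inputs;    // every non-option argument
};

// Hypothetical parser: walks argv once and fills an Options value.
// Throws std::invalid_argument when a flag is missing its value or the
// thread count is not a positive integer.
Options parse_args(int argc, const char* const argv[]) {
    Options opts;
    for (int i = 1; i < argc; ++i) {
        const std::string arg = argv[i];
        if (arg == "-h" || arg == "--help" || arg == "-?") {
            opts.help = true;
        } else if (arg == "-o" || arg == "--output") {
            if (++i >= argc) throw std::invalid_argument(arg + " requires a file name");
            opts.output = argv[i];
        } else if (arg == "-t" || arg == "--threads") {
            if (++i >= argc) throw std::invalid_argument(arg + " requires a number");
            const int n = std::stoi(argv[i]);  // throws on non-numeric input
            if (n < 1) throw std::invalid_argument("thread count must be at least 1");
            opts.threads = static_cast<unsigned>(n);
        } else {
            opts.inputs.push_back(arg);  // treated as an input file name
        }
    }
    return opts;
}
```

A `main` function could call `parse_args(argc, argv)` and print the usage line when `help` is set or `inputs` is empty.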
- Parse command line arguments to get file names, output file name, and number of threads.
- Read each input file and add its words to an unordered set.
- Create a vector of output file streams, one for each thread.
- Split the set of words into roughly equal-sized chunks, one for each thread.
- Launch each thread to process its chunk of words and write them to its output file stream.
- Wait for all threads to finish.
- Merge the output files into a single output file.
- Print success message and elapsed time.
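The steps above can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the program's actual source: the helper names `read_words` and `merge_words`, the `.partN` temporary file naming, and the choice to copy the set into a vector before chunking are all hypothetical. Argument parsing and timing are left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <string>
#include <thread>
#include <unordered_set>
#include <vector>

// Hypothetical helper: read whitespace-separated words from every input
// file into an unordered set, which discards duplicates.
std::unordered_set<std::string> read_words(const std::vector<std::string>& inputs) {
    std::unordered_set<std::string> words;
    for (const auto& path : inputs) {
        std::ifstream in(path);
        std::string word;
        while (in >> word) words.insert(word);
    }
    return words;
}

// Hypothetical helper: give each thread one chunk and one output stream,
// wait for all threads, then concatenate the per-thread files into `output`.
// Returns the number of words written.
std::size_t merge_words(const std::unordered_set<std::string>& words,
                        const std::string& output, unsigned num_threads) {
    if (num_threads == 0) num_threads = 1;
    // Assumption: copy to a vector so chunks can be addressed by index.
    const std::vector<std::string> flat(words.begin(), words.end());
    const std::size_t chunk = (flat.size() + num_threads - 1) / num_threads;

    // Open every stream before starting threads so the vector never resizes.
    std::vector<std::string> part_names;
    std::vector<std::ofstream> parts;
    for (unsigned t = 0; t < num_threads; ++t) {
        part_names.push_back(output + ".part" + std::to_string(t));
        parts.emplace_back(part_names.back());
    }

    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(flat.size(), t * chunk);
        const std::size_t end = std::min(flat.size(), begin + chunk);
        workers.emplace_back([&flat, &parts, t, begin, end] {
            for (std::size_t i = begin; i < end; ++i) parts[t] << flat[i] << '\n';
        });
    }
    for (auto& w : workers) w.join();  // wait for all threads to finish
    for (auto& p : parts) p.close();   // flush before reading the files back

    // Merge the per-thread files line by line, then delete them.
    std::ofstream merged(output);
    std::size_t written = 0;
    for (const auto& name : part_names) {
        {
            std::ifstream in(name);
            std::string line;
            while (std::getline(in, line)) {
                merged << line << '\n';
                ++written;
            }
        }
        std::remove(name.c_str());
    }
    return written;
}
```

Because each thread writes only to its own stream and reads a disjoint index range, this sketch needs no locking. The order of words in the merged file follows the set's iteration order, so it is not sorted.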
The program is implemented in C++ and uses the following libraries:
- iostream and fstream for file I/O.
- unordered_set for storing unique words.
- vector for storing output file streams.
- chrono for measuring elapsed time.
- thread for creating and managing threads.
To improve performance, the program reads and writes files in chunks and uses multiple threads to process the data in parallel.
To merge the files file1.txt, file2.txt, and file3.txt into the output file merged.txt using 8 threads, run the following command:
./merge -o merged.txt -t 8 file1.txt file2.txt file3.txt