OP 15 May, 2021 - 01:55 PM
Skip the appetizers, let's bring out the meal.
There are a lot of data structures in Python; however, the fastest built-in (vanilla) one for this job is a set.
I say vanilla because libraries such as NumPy can be faster for some workloads. Though we won't need something
as complex as NumPy for such a simple task. You don't kill an ant with a gun... unless
you're in America.
How do we use sets?
Hold on feisty, let me explain what a set is.
A set is an unordered, unindexed collection that cannot contain duplicates, and its elements must be hashable (in practice, immutable objects such as strings and numbers). The set itself can still have items added and removed.
Not to mention, sets are way faster than lists for membership checks. Why is this? Let's quickly dive into that.
Sets are implemented using hash tables. Once an object goes into a set, its position within the set's memory is determined by the hash of the object.
This means that to check whether an object exists inside a given set, all you need to do is look at the position determined by the hash, which takes roughly the same time no matter how big the set is.
Now lists are different. If you run the same test on a list, Python compares the items one by one, so in the worst case it has to check the entire list to find a single object, which is why they are
so slow for this.
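If you want to see the difference for yourself, here is a rough timing sketch. The data size, the `needle` value, and the variable names are all made up for illustration, and the exact numbers will vary from machine to machine.

```python
import timeit

# Hypothetical test data: the same 100,000 integers stored both ways
items_list = list(range(100_000))
items_set = set(items_list)

needle = 99_999  # worst case for the list: the very last element

# Run each membership check 200 times and compare the total time
list_time = timeit.timeit(lambda: needle in items_list, number=200)
set_time = timeit.timeit(lambda: needle in items_set, number=200)

print(f"list: {list_time:.4f}s  set: {set_time:.6f}s")
```

The list check has to walk through every element before it finds the last one, while the set check jumps straight to the slot given by the hash.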
Great! Now we know what a set is and why it's fast. How would we actually use this in a duplicates remover?
As previously mentioned, sets cannot have duplicates. I won't be explaining why in depth, as that would need you to have an
understanding of Mathematics, Data Structures, and general programming. I'd love to explain that, but I doubt
anyone here really cares x)
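For anyone who does care, the short version is that equal objects have equal hashes, so a duplicate lands on the same slot as the original and the set simply doesn't store it a second time. A tiny sketch, using made-up sample lines:

```python
# Hypothetical sample lines, with one repeat
seen = set()
for line in ["user:pass", "admin:1234", "user:pass"]:
    seen.add(line)  # adding an element that is already present does nothing

print(len(seen))  # 2

# Equal strings always produce the same hash within one run of Python
print(hash("user:pass") == hash("user:pass"))  # True
```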
Anyways, let's get on with the code.
The simple version:
Code:
lines = set()  # Creating a set for storing the unique lines

# Reading the contents of combos.txt into a list; the with block closes the file for us
with open('combos.txt', encoding='UTF-8', errors='ignore') as file:
    combos = file.readlines()

for line in combos:  # Iterating through the combos list and saving each stripped line to our lines set
    lines.add(line.strip())

lines = '\n'.join(lines)  # Joining the lines in our set with a linebreak (the original order is not kept)

with open('results.txt', 'w', encoding='UTF-8') as file:  # Saving the results to a file named results.txt
    file.write(lines)
The "more advanced" version:
Code:
with open('combos.txt', encoding='UTF-8', errors='ignore') as src, open('results.txt', 'w', encoding='UTF-8') as dst:
    dst.write('\n'.join({line.strip() for line in src}))
While both look completely different, they accomplish the same task.
I'm not going to explain how the 2nd script works as that'll be your homework.
The first script is self-explanatory with some comments to help guide you.
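One thing to keep in mind: since sets are unordered, neither script keeps the lines in their original order. If that matters to you, a possible variant (not part of the scripts above) is to use dict keys instead, which are also unique but remember insertion order on Python 3.7+. The function name and sample data here are just placeholders:

```python
def dedupe_keep_order(lines):
    """Return the unique stripped lines in first-seen order."""
    # dict keys are unique and keep insertion order (Python 3.7+)
    return list(dict.fromkeys(line.strip() for line in lines))


# Hypothetical sample input, with one repeat
sample = ["b:2\n", "a:1\n", "b:2\n", "c:3\n"]
print(dedupe_keep_order(sample))  # ['b:2', 'a:1', 'c:3']
```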
That's about it, tooda-loo!