Read all words from a given text file and print a count for each
Test.txt contains the following sentence: "How much wood would a woodchuck chuck if a woodchuck could chuck wood."
This program is supposed to read all words from a given text file (until EOF) and print a count for each word. The words should be processed case-insensitively (converted to all capitals), punctuation should be removed, and the output should be sorted by frequency.
However, I've run into a simple problem: it's counting lines and not the words. Help a brother out.

    # Make a translation table for getting rid of non-word characters
    dropChars = "!@#$%^&*()_+-={}|\\:;\"'<>,.?/1234567890"
    dropDict = dict([(c, '') for c in dropChars])
    dropTable = str.maketrans(dropDict)

    # Read a file and build the table.
    f = open("Test.txt")
    testList = list()
    lineNum = 0
    table = {}  # dictionary: words -> set of line numbers
    for line in f:
        testList.append(line)
    for line in testList:
        lineNum += 1
        words = line.upper().translate(dropTable).split()
        for word in words:
            if word in table:
                table[word].add(lineNum)
            else:
                table[word] = {lineNum}
    f.close()

    # Print the table
    for word in sorted(table.keys()):
        print(word, end=": ")
        for lineNum in sorted(table[word]):
            print(lineNum, end=" ")
        print()

I don't see how a set could be used to track the frequency of a word. Could you demonstrate?
– NationzGG
28 mins ago
2 Answers

    f = open('Test.txt')
    cnt = 0
    for word in f.read().split():  # split the whole file on whitespace
        print(word)
        cnt += 1
    print(cnt)  # total number of words in the file
    f.close()

This might help you, brother... although I am also a newbie in Python.
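To get a count per word rather than a single total, one small change to the question's loop is to store an integer in the dictionary instead of a set of line numbers. The following is only a sketch under some assumptions: `count_words` and `sample` are hypothetical names, and `string.punctuation + string.digits` stands in for the question's `dropChars`.

```python
import string

# Assumption: dropping ASCII punctuation and digits roughly matches dropChars.
DROP_TABLE = str.maketrans("", "", string.punctuation + string.digits)


def count_words(lines):
    """Return a dict mapping each upper-cased word to how often it occurs."""
    table = {}  # dictionary: word -> number of occurrences
    for line in lines:
        for word in line.upper().translate(DROP_TABLE).split():
            table[word] = table.get(word, 0) + 1
    return table


# Hypothetical sample standing in for the lines of Test.txt.
sample = [
    "How much wood would a woodchuck chuck if a woodchuck could",
    "chuck wood.",
]
table = count_words(sample)

# Sort by frequency (highest first), breaking ties alphabetically.
for word, count in sorted(table.items(), key=lambda item: (-item[1], item[0])):
    print(word, count, sep=": ")
```

Since a file object is itself an iterable of lines, `count_words(open("Test.txt"))` should work the same way on the real file.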
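This code:

    from collections import Counter

    data = open('Test1.txt').read()  # read the file
    # upper-case the letters and turn every other character into a space
    data = ''.join([i.upper() if i.isalpha() else ' ' for i in data])
    c = Counter(data.split())  # count the words
    print(c.most_common())  # (word, count) pairs, most frequent first

prints (the order among words with equal counts can differ between Python versions):
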
[('A', 2), ('CHUCK', 2), ('WOODCHUCK', 2), ('WOOD', 2), ('WOULD', 1), ('COULD', 1), ('HOW', 1), ('MUCH', 1), ('IF', 1)]
I wonder if the code is too short? =)
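If one `WORD: count` line per word is preferred over a list of tuples, the same `Counter` idea can be wrapped in a function, roughly as below. This is a sketch, not a drop-in solution: `word_frequencies` is a hypothetical name, and the temporary file exists only so the example runs on its own.

```python
import tempfile
from collections import Counter
from pathlib import Path


def word_frequencies(path):
    """Count upper-cased alphabetic words in the file at `path`."""
    text = Path(path).read_text()
    # Keep letters (upper-cased) and replace everything else with a space.
    cleaned = "".join(ch.upper() if ch.isalpha() else " " for ch in text)
    return Counter(cleaned.split())


# Demo file, created only to make the sketch self-contained.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "Test.txt"
    demo.write_text(
        "How much wood would a woodchuck chuck if a woodchuck could\nchuck wood.\n"
    )
    counts = word_frequencies(demo)

# most_common() yields (word, count) pairs, most frequent first.
for word, count in counts.most_common():
    print(f"{word}: {count}")
```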
Why don't you just split on spaces and create a set?
– Hearner
33 mins ago
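On the set idea: a set only records which words occur, so by itself it gives the number of distinct words, not how often each one appears. A rough illustration follows, using the sentence from the question inline (without the final period) rather than reading the file; the variable names are hypothetical.

```python
words = (
    "how much wood would a woodchuck chuck if a woodchuck could chuck wood"
    .upper()
    .split()
)

unique_words = set(words)  # duplicates collapse, only membership is kept
print(len(words))          # total number of words
print(len(unique_words))   # number of distinct words

# A set can still drive the counting, at the cost of rescanning the list
# once per distinct word.
frequencies = {word: words.count(word) for word in unique_words}
print(frequencies["WOOD"])
```

For anything larger than a toy input, a dictionary of counts or `collections.Counter` avoids the repeated scans.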