Total words: 221113
i: 10253
and: 9033
to: 7521
the: 5601
you: 4793
my: 4584
of: 3787
me: 3686
a: 3413
said: 3140
he: 2866
for: 2865
in: 2747
that: 2648
it: 2443
as: 2408
be: 2379
have: 2302
not: 2247
but: 2053
she: 1918
so: 1898
your: 1861
her: 1830
was: 1755
with: 1656
will: 1572
is: 1527
this: 1385
all: 1239
if: 1168
his: 1156
had: 1145
would: 983
at: 963
what: 945
good: 848
by: 812
mrs: 812
am: 809
sir: 795
him: 783
shall: 736
dear: 732
when: 710
on: 708
no: 707
well: 692
should: 676
are: 661
do: 657
master: 617
one: 602
very: 601
see: 593
upon: 591
from: 589
them: 578
they: 578
may: 575
now: 552
has: 548
then: 546
poor: 529
more: 528
or: 523
how: 521
which: 521
could: 516
pamela: 512
know: 510
can: 508
much: 503
think: 499
we: 494
out: 493
any: 490
say: 487
mr: 466
little: 464
been: 459
must: 456
up: 453
too: 450
such: 447
lady: 446
let: 442
make: 423
an: 407
than: 406
myself: 404
jewkes: 387
come: 374
hope: 370
who: 345
though: 344
did: 338
there: 335
thought: 330
made: 328
Word Counter for Text Comparisons
Overview
Below is a word counter that compares word frequencies in Pamela; or, Virtue Rewarded by Samuel Richardson with those in The Life and Adventures of Robinson Crusoe by Daniel Defoe. There are two separate word counters: one includes every word, and the other ignores common words (e.g. and, the, to, very). For rendering purposes, only the 100 most frequent words are presented.
Both texts are in the public domain, and this script is intended for educational and research purposes in accordance with fair use guidelines. Both texts are provided by Project Gutenberg at the links below:
Pamela: https://www.gutenberg.org/cache/epub/6124/pg6124-images.html
Crusoe: https://www.gutenberg.org/cache/epub/521/pg521-images.html
The code for this project is available on GitHub. There, you will find instructions on how to count the words of any .txt file of your choosing.
Link to GitHub: https://github.com/fastball-marty/Word-Counter
Count all words
Counts the frequency of every word in Pamela and Robinson Crusoe and displays the 100 most frequent words for each.
Pamela | Robinson Crusoe |
---|---|
|
|
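The counting step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `count_words` and `top_words` are hypothetical, and the tokenization rule (lowercase the text, then treat runs of letters with an optional internal apostrophe as words) is an assumption.

```python
import re
from collections import Counter


def count_words(text: str) -> Counter:
    """Count how often each word occurs in text, ignoring case."""
    # Assumed rule: a word is a run of letters, optionally followed by
    # an apostrophe and more letters (so "don't" stays one word).
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return Counter(words)


def top_words(text: str, n: int = 100) -> list:
    """Return the n most frequent words as (word, count) pairs."""
    return count_words(text).most_common(n)
```

Passing the full text of a novel to `top_words` would produce a ranked list in the same shape as the tables on this page.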
Ignore common words
Counts the frequency of all words in Pamela and Robinson Crusoe, excluding the words in the list of common words below.
List of common words: the, and, a, to, of, in, is, you, that, it, he, was, for, on, are, as, with, his, they, at, be, this, have, from, or, one, had, by, but, not, what, all, were, we, when, your, can, said, there, an, each, which, she, do, how, their, if, will, up, other, about, out, many, then, them, these, so, some, her, would, make, like, him, into, time, has, look, two, more, go, see, no, way, could, people, than, first, been, who, its, now, find, long, down, day, did, get, come, made, may, part, me, am, shall, should, very, upon, might, much, such, though, yet, too, any.
Pamela | Robinson Crusoe |
---|---|
|
|
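One way to apply the common-word filter is to drop those words from an existing set of counts. The sketch below is an assumption about how this could be done, not the project's implementation; `drop_common` is a hypothetical name, and `COMMON_WORDS` holds only the first ten entries of the list above to keep the example short.

```python
from collections import Counter

# Excerpt only: the first ten entries of the list of common words.
COMMON_WORDS = {"the", "and", "a", "to", "of", "in", "is", "you", "that", "it"}


def drop_common(counts: Counter, common=COMMON_WORDS) -> Counter:
    """Return a copy of counts without any word found in common."""
    return Counter({w: c for w, c in counts.items() if w not in common})


# A few of the Pamela counts shown on this page.
pamela = Counter({"i": 10253, "and": 9033, "to": 7521, "the": 5601, "pamela": 512})
filtered = drop_common(pamela)
```

Because "i" is not in the list of common words, it survives the filter and remains the most frequent word in `filtered`.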
Words exclusive to each novel
Displays the 100 most common words in Pamela that are not in Robinson Crusoe and vice versa.
Pamela | Robinson Crusoe |
---|---|
|
|
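The comparison described in this section amounts to a set difference over the two vocabularies, ranked by count. The following is a minimal sketch under that assumption; `exclusive_words` is a hypothetical name, and the two short strings are toy inputs, not text from either novel.

```python
from collections import Counter


def exclusive_words(counts_a: Counter, counts_b: Counter, n: int = 100) -> list:
    """Return the n most frequent words in counts_a that never occur in counts_b."""
    only_a = Counter({w: c for w, c in counts_a.items() if w not in counts_b})
    return only_a.most_common(n)


# Toy inputs for illustration only.
a = Counter("pamela master sir sir island".split())
b = Counter("island ship ship sir".split())

only_in_a = exclusive_words(a, b)  # words that appear in a but not in b
only_in_b = exclusive_words(b, a)  # and vice versa
```

Calling the function twice with the arguments swapped gives the two columns of the table.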
Sentiment Analysis (Beta)
Using NLTK's VADER sentiment analyzer, each sentence in Pamela was given a numerical score for its positive, neutral, and negative sentiment. Below are the 15 sentences with the highest negative scores and the 15 sentences with the highest positive scores.
The compound value is an overall score for the sentence, normalized to range from -1 (most negative) to +1 (most positive).
VADER was designed for sentiment analysis of social media posts, product reviews, and other short texts, so it is not an ideal model for an 18th-century novel. For example, it scores "O help!" as fully positive. A model trained on 18th-century literature would likely be more accurate for this domain.
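The ranking step can be sketched as below. This is an illustration under stated assumptions rather than the project's code: `rank_sentences` and `make_vader_scorer` are hypothetical names, and the sentence splitting that would produce the input list is not shown. The NLTK calls used (`nltk.download`, `SentimentIntensityAnalyzer`, `polarity_scores`) are part of NLTK and are imported lazily so the ranking helper can be used with any scoring function.

```python
from typing import Callable, Dict, List, Tuple

Scores = Dict[str, float]


def rank_sentences(
    sentences: List[str],
    score: Callable[[str], Scores],
    key: str,
    top_n: int = 15,
) -> List[Tuple[str, Scores]]:
    """Score every sentence and return the top_n with the highest scores[key]."""
    scored = [(s, score(s)) for s in sentences]
    # Highest value first; the sort is stable, so ties keep their input order.
    scored.sort(key=lambda pair: pair[1][key], reverse=True)
    return scored[:top_n]


def make_vader_scorer() -> Callable[[str], Scores]:
    """Build a scorer backed by NLTK's VADER (requires nltk to be installed)."""
    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    # Fetch the VADER lexicon if it is not already available locally.
    nltk.download("vader_lexicon", quiet=True)
    # polarity_scores returns a dict with 'neg', 'neu', 'pos' and 'compound'.
    return SentimentIntensityAnalyzer().polarity_scores
```

With a list of sentences in hand, `rank_sentences(sentences, make_vader_scorer(), "neg")` would return the 15 most negative sentences, and passing `"pos"` instead would return the 15 most positive.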
Highest Negative Sentiments:
Alas!: {'neg': 1.0, 'neu': 0.0, 'pos': 0.0, 'compound': -0.3382}
O frightful!: {'neg': 1.0, 'neu': 0.0, 'pos': 0.0, 'compound': -0.5562}
Ruin!: {'neg': 1.0, 'neu': 0.0, 'pos': 0.0, 'compound': -0.6239}
shame!: {'neg': 1.0, 'neu': 0.0, 'pos': 0.0, 'compound': -0.5255}
disgrace!: {'neg': 1.0, 'neu': 0.0, 'pos': 0.0, 'compound': -0.5411}
Foolish!: {'neg': 1.0, 'neu': 0.0, 'pos': 0.0, 'compound': -0.3382}
No, no!: {'neg': 1.0, 'neu': 0.0, 'pos': 0.0, 'compound': -0.5707}
I cried sadly for vexation;: {'neg': 0.892, 'neu': 0.108, 'pos': 0.0, 'compound': -0.8074}
Wicked, wicked man!: {'neg': 0.876, 'neu': 0.124, 'pos': 0.0, 'compound': -0.7959}
All sadly vile:: {'neg': 0.873, 'neu': 0.127, 'pos': 0.0, 'compound': -0.7845}
sad poor stuff!: {'neg': 0.867, 'neu': 0.133, 'pos': 0.0, 'compound': -0.7574}
Poor, poor man!: {'neg': 0.867, 'neu': 0.133, 'pos': 0.0, 'compound': -0.7574}
I wept bitterly, however;: {'neg': 0.857, 'neu': 0.143, 'pos': 0.0, 'compound': -0.7184}
I meant no harm;: {'neg': 0.851, 'neu': 0.149, 'pos': 0.0, 'compound': -0.6908}
I meant no harm.: {'neg': 0.851, 'neu': 0.149, 'pos': 0.0, 'compound': -0.6908}
Highest Positive Sentiments:
Innocent!: {'neg': 0.0, 'neu': 0.0, 'pos': 1.0, 'compound': 0.4003}
Well!: {'neg': 0.0, 'neu': 0.0, 'pos': 1.0, 'compound': 0.3382}
Sweet excellence!: {'neg': 0.0, 'neu': 0.0, 'pos': 1.0, 'compound': 0.8122}
I welcome!: {'neg': 0.0, 'neu': 0.0, 'pos': 1.0, 'compound': 0.5093}
yes, surely!: {'neg': 0.0, 'neu': 0.0, 'pos': 1.0, 'compound': 0.7088}
O good God!: {'neg': 0.0, 'neu': 0.0, 'pos': 1.0, 'compound': 0.6476}
O help!: {'neg': 0.0, 'neu': 0.0, 'pos': 1.0, 'compound': 0.4574}
happy.: {'neg': 0.0, 'neu': 0.0, 'pos': 1.0, 'compound': 0.5719}
O God!: {'neg': 0.0, 'neu': 0.0, 'pos': 1.0, 'compound': 0.3382}
Kind, lovely charmer!: {'neg': 0.0, 'neu': 0.0, 'pos': 1.0, 'compound': 0.8858}
contented;: {'neg': 0.0, 'neu': 0.0, 'pos': 1.0, 'compound': 0.34}
Great and good God!: {'neg': 0.0, 'neu': 0.096, 'pos': 0.904, 'compound': 0.8553}
God bless your honour!: {'neg': 0.0, 'neu': 0.101, 'pos': 0.899, 'compound': 0.8356}
I kissed his dear hand:: {'neg': 0.0, 'neu': 0.106, 'pos': 0.894, 'compound': 0.8126}
happy, happy Mr.: {'neg': 0.0, 'neu': 0.119, 'pos': 0.881, 'compound': 0.8126}