Table of Contents

Name : Ranjana Rajendran

Email : ranjana.rajendran@gmail.com

Topic : Spoiler Alert

Summary

The problem statement is that given the movie_details and revews about that movie from users, alert if the review_text is a spoiler about the movie or not.

In this study, I have performed a study of the input data to see if there is any correlation between the other features and the is_spoiler field. None could be found. The data was exploded on the genres field to see if there is any correlation between the genre and is_spoiler and none was found. The only aspect found was that as the length of review_text increased, that review is more likely to be a spoiler than not.

Given that no other aspect could be found that could contribute towards a review being a spoiler or not, we are left with natural language analysis of the review_text and other fields such as plot_synopsis or plot summary using deep learning techniques.

The review_text, plot_summary, plot_synopsis and review_summary were first pre-processed. They were converted to lowercase (as bert0-base-uncase was uncased), unnecessary white-spaces removed, unicode characters normalized, URLs, html tags and special characters removed, contractions expanded. Experiments were conducted on lemmatization. The data was persisted without lemmatization, howevere, lemmatization was done for Bidirectional LSTM model.

The following deep learning techniques were applied 🇰

  1. Bidirectional LSTM : An accuracy of 0.7656 was achieved on validation dataset and 0.65 on test dataset. This was done ona set of 10000 positive and 10000 negative examples sambpled from the full corpus.
  2. Pre-trained BERT on a sequence of review_text and plot_summary : Accuracy of 0.698 was achieved on validation dataset and 0.681 on test dataset. This was done on a set of 5000 positive and 5000 negative examples sampled from the full corpus.
  3. Pre-Trained Longformer on a sequence of review_text and plot_summary: Accuracy of 0.721 was achieved on validation dataset and 0.715 on test dataset. This was done on a set of 5000 positive and 5000 negative examples sampled from the full corpus.
  4. Ensemble for prediction (2 indepenently trained towers) on separately trained 2 BERT models, each on review_text and plot_synopsis separately: Accuracy of 0.43 was achieved on test dataset. This was done on a set of 5000 positive and 5000 negative examples sampled from the full corpus.

Notes on training :

  1. For the transformer models LoRA https://arxiv.org/abs/2106.09685 optimization was utilized.
  2. Early stopping was employed, utilizing the validation loss as the metric for early stopping.
  3. Given the limitation of time and computation available, the training could be performed only on a small set of the available dataset.

Future work

  1. Train each of these models on a larger dataset.
  2. As you have seen, Lonfomer gave the best results on review_text and plot_summary. I could use a larger dataset and plot_synopsis instead of plot_summary for Longformer and see if this will produce better results than an accuracy of 0.715 on test dataset.

Install required libraries

In [ ]:
!pip install accelerate -U
exit()
Requirement already satisfied: accelerate in /usr/local/lib/python3.10/dist-packages (0.32.1)
Requirement already satisfied: numpy<2.0.0,>=1.17 in /usr/local/lib/python3.10/dist-packages (from accelerate) (1.25.2)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from accelerate) (24.1)
Requirement already satisfied: psutil in /usr/local/lib/python3.10/dist-packages (from accelerate) (5.9.5)
Requirement already satisfied: pyyaml in /usr/local/lib/python3.10/dist-packages (from accelerate) (6.0.1)
Requirement already satisfied: torch>=1.10.0 in /usr/local/lib/python3.10/dist-packages (from accelerate) (2.3.1+cu121)
Requirement already satisfied: huggingface-hub in /usr/local/lib/python3.10/dist-packages (from accelerate) (0.23.5)
Requirement already satisfied: safetensors>=0.3.1 in /usr/local/lib/python3.10/dist-packages (from accelerate) (0.4.3)
Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from torch>=1.10.0->accelerate) (3.15.4)
Requirement already satisfied: typing-extensions>=4.8.0 in /usr/local/lib/python3.10/dist-packages (from torch>=1.10.0->accelerate) (4.12.2)
Requirement already satisfied: sympy in /usr/local/lib/python3.10/dist-packages (from torch>=1.10.0->accelerate) (1.13.0)
Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from torch>=1.10.0->accelerate) (3.3)
Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch>=1.10.0->accelerate) (3.1.4)
Requirement already satisfied: fsspec in /usr/local/lib/python3.10/dist-packages (from torch>=1.10.0->accelerate) (2023.6.0)
Collecting nvidia-cuda-nvrtc-cu12==12.1.105 (from torch>=1.10.0->accelerate)
  Using cached nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (23.7 MB)
Collecting nvidia-cuda-runtime-cu12==12.1.105 (from torch>=1.10.0->accelerate)
  Using cached nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (823 kB)
Collecting nvidia-cuda-cupti-cu12==12.1.105 (from torch>=1.10.0->accelerate)
  Using cached nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (14.1 MB)
Collecting nvidia-cudnn-cu12==8.9.2.26 (from torch>=1.10.0->accelerate)
  Using cached nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64.whl (731.7 MB)
Collecting nvidia-cublas-cu12==12.1.3.1 (from torch>=1.10.0->accelerate)
  Using cached nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl (410.6 MB)
Collecting nvidia-cufft-cu12==11.0.2.54 (from torch>=1.10.0->accelerate)
  Using cached nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl (121.6 MB)
Collecting nvidia-curand-cu12==10.3.2.106 (from torch>=1.10.0->accelerate)
  Using cached nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl (56.5 MB)
Collecting nvidia-cusolver-cu12==11.4.5.107 (from torch>=1.10.0->accelerate)
  Using cached nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl (124.2 MB)
Collecting nvidia-cusparse-cu12==12.1.0.106 (from torch>=1.10.0->accelerate)
  Using cached nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl (196.0 MB)
Collecting nvidia-nccl-cu12==2.20.5 (from torch>=1.10.0->accelerate)
  Using cached nvidia_nccl_cu12-2.20.5-py3-none-manylinux2014_x86_64.whl (176.2 MB)
Collecting nvidia-nvtx-cu12==12.1.105 (from torch>=1.10.0->accelerate)
  Using cached nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (99 kB)
Requirement already satisfied: triton==2.3.1 in /usr/local/lib/python3.10/dist-packages (from torch>=1.10.0->accelerate) (2.3.1)
Collecting nvidia-nvjitlink-cu12 (from nvidia-cusolver-cu12==11.4.5.107->torch>=1.10.0->accelerate)
  Downloading nvidia_nvjitlink_cu12-12.5.82-py3-none-manylinux2014_x86_64.whl (21.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 21.3/21.3 MB 66.8 MB/s eta 0:00:00
Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from huggingface-hub->accelerate) (2.31.0)
Requirement already satisfied: tqdm>=4.42.1 in /usr/local/lib/python3.10/dist-packages (from huggingface-hub->accelerate) (4.66.4)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->torch>=1.10.0->accelerate) (2.1.5)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests->huggingface-hub->accelerate) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests->huggingface-hub->accelerate) (3.7)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests->huggingface-hub->accelerate) (2.0.7)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests->huggingface-hub->accelerate) (2024.7.4)
Requirement already satisfied: mpmath<1.4,>=1.1.0 in /usr/local/lib/python3.10/dist-packages (from sympy->torch>=1.10.0->accelerate) (1.3.0)
Installing collected packages: nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufft-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, nvidia-cusparse-cu12, nvidia-cudnn-cu12, nvidia-cusolver-cu12
Successfully installed nvidia-cublas-cu12-12.1.3.1 nvidia-cuda-cupti-cu12-12.1.105 nvidia-cuda-nvrtc-cu12-12.1.105 nvidia-cuda-runtime-cu12-12.1.105 nvidia-cudnn-cu12-8.9.2.26 nvidia-cufft-cu12-11.0.2.54 nvidia-curand-cu12-10.3.2.106 nvidia-cusolver-cu12-11.4.5.107 nvidia-cusparse-cu12-12.1.0.106 nvidia-nccl-cu12-2.20.5 nvidia-nvjitlink-cu12-12.5.82 nvidia-nvtx-cu12-12.1.105
In [ ]:
! pip install ray
Requirement already satisfied: ray in /usr/local/lib/python3.10/dist-packages (2.32.0)
Requirement already satisfied: click>=7.0 in /usr/local/lib/python3.10/dist-packages (from ray) (8.1.7)
Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from ray) (3.15.4)
Requirement already satisfied: jsonschema in /usr/local/lib/python3.10/dist-packages (from ray) (4.19.2)
Requirement already satisfied: msgpack<2.0.0,>=1.0.0 in /usr/local/lib/python3.10/dist-packages (from ray) (1.0.8)
Requirement already satisfied: packaging in /usr/local/lib/python3.10/dist-packages (from ray) (24.1)
Requirement already satisfied: protobuf!=3.19.5,>=3.15.3 in /usr/local/lib/python3.10/dist-packages (from ray) (3.20.3)
Requirement already satisfied: pyyaml in /usr/local/lib/python3.10/dist-packages (from ray) (6.0.1)
Requirement already satisfied: aiosignal in /usr/local/lib/python3.10/dist-packages (from ray) (1.3.1)
Requirement already satisfied: frozenlist in /usr/local/lib/python3.10/dist-packages (from ray) (1.4.1)
Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from ray) (2.31.0)
Requirement already satisfied: attrs>=22.2.0 in /usr/local/lib/python3.10/dist-packages (from jsonschema->ray) (23.2.0)
Requirement already satisfied: jsonschema-specifications>=2023.03.6 in /usr/local/lib/python3.10/dist-packages (from jsonschema->ray) (2023.12.1)
Requirement already satisfied: referencing>=0.28.4 in /usr/local/lib/python3.10/dist-packages (from jsonschema->ray) (0.35.1)
Requirement already satisfied: rpds-py>=0.7.1 in /usr/local/lib/python3.10/dist-packages (from jsonschema->ray) (0.19.0)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests->ray) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests->ray) (3.7)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests->ray) (2.0.7)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests->ray) (2024.7.4)
In [ ]:
! pip install datasets
Collecting datasets
  Downloading datasets-2.20.0-py3-none-any.whl (547 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.0/547.8 kB ? eta -:--:--
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 286.7/547.8 kB 8.4 MB/s eta 0:00:01
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 547.8/547.8 kB 9.0 MB/s eta 0:00:00
Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from datasets) (3.15.4)
Requirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.10/dist-packages (from datasets) (1.25.2)
Collecting pyarrow>=15.0.0 (from datasets)
  Downloading pyarrow-17.0.0-cp310-cp310-manylinux_2_28_x86_64.whl (39.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 39.9/39.9 MB 43.0 MB/s eta 0:00:00
Requirement already satisfied: pyarrow-hotfix in /usr/local/lib/python3.10/dist-packages (from datasets) (0.6)
Collecting dill<0.3.9,>=0.3.0 (from datasets)
  Downloading dill-0.3.8-py3-none-any.whl (116 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 116.3/116.3 kB 17.4 MB/s eta 0:00:00
Requirement already satisfied: pandas in /usr/local/lib/python3.10/dist-packages (from datasets) (2.0.3)
Collecting requests>=2.32.2 (from datasets)
  Downloading requests-2.32.3-py3-none-any.whl (64 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 64.9/64.9 kB 10.6 MB/s eta 0:00:00
Requirement already satisfied: tqdm>=4.66.3 in /usr/local/lib/python3.10/dist-packages (from datasets) (4.66.4)
Collecting xxhash (from datasets)
  Downloading xxhash-3.4.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (194 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 194.1/194.1 kB 26.4 MB/s eta 0:00:00
Collecting multiprocess (from datasets)
  Downloading multiprocess-0.70.16-py310-none-any.whl (134 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 134.8/134.8 kB 19.1 MB/s eta 0:00:00
Requirement already satisfied: fsspec[http]<=2024.5.0,>=2023.1.0 in /usr/local/lib/python3.10/dist-packages (from datasets) (2023.6.0)
Requirement already satisfied: aiohttp in /usr/local/lib/python3.10/dist-packages (from datasets) (3.9.5)
Requirement already satisfied: huggingface-hub>=0.21.2 in /usr/local/lib/python3.10/dist-packages (from datasets) (0.23.5)
Requirement already satisfied: packaging in /usr/local/lib/python3.10/dist-packages (from datasets) (24.1)
Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.10/dist-packages (from datasets) (6.0.1)
Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets) (1.3.1)
Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets) (23.2.0)
Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets) (1.4.1)
Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets) (6.0.5)
Requirement already satisfied: yarl<2.0,>=1.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets) (1.9.4)
Requirement already satisfied: async-timeout<5.0,>=4.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets) (4.0.3)
Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.10/dist-packages (from huggingface-hub>=0.21.2->datasets) (4.12.2)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests>=2.32.2->datasets) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests>=2.32.2->datasets) (3.7)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests>=2.32.2->datasets) (2.0.7)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests>=2.32.2->datasets) (2024.7.4)
Requirement already satisfied: python-dateutil>=2.8.2 in /usr/local/lib/python3.10/dist-packages (from pandas->datasets) (2.8.2)
Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas->datasets) (2023.4)
Requirement already satisfied: tzdata>=2022.1 in /usr/local/lib/python3.10/dist-packages (from pandas->datasets) (2024.1)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.8.2->pandas->datasets) (1.16.0)
Installing collected packages: xxhash, requests, pyarrow, dill, multiprocess, datasets
  Attempting uninstall: requests
    Found existing installation: requests 2.31.0
    Uninstalling requests-2.31.0:
      Successfully uninstalled requests-2.31.0
  Attempting uninstall: pyarrow
    Found existing installation: pyarrow 14.0.2
    Uninstalling pyarrow-14.0.2:
      Successfully uninstalled pyarrow-14.0.2
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
cudf-cu12 24.4.1 requires pyarrow<15.0.0a0,>=14.0.1, but you have pyarrow 17.0.0 which is incompatible.
google-colab 1.0.0 requires requests==2.31.0, but you have requests 2.32.3 which is incompatible.
ibis-framework 8.0.0 requires pyarrow<16,>=2, but you have pyarrow 17.0.0 which is incompatible.
Successfully installed datasets-2.20.0 dill-0.3.8 multiprocess-0.70.16 pyarrow-17.0.0 requests-2.32.3 xxhash-3.4.1
In [ ]:
prefix_path = '/content/drive/MyDrive/ML-SwitchUP/Deep Learning Projects/NLP Project/'

Download and prepare the data

In [ ]:
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive
In [ ]:
import os
import gdown
import zipfile

zip_url = "https://drive.google.com/uc?export=download&id=16zn74iUWV-Qiv4yYkaMY-spr-LJIys9Z"

dataset_path = os.path.join(prefix_path, "dataset", "imdb_spoiler")
zip_file_path = os.path.join(dataset_path, "imdb_spoiler_dataset.zip")

if not os.path.exists(dataset_path):
   os.makedirs(dataset_path)

if not os.path.exists(zip_file_path):
   gdown.download(zip_url, zip_file_path, quiet=False)

with zipfile.ZipFile(zip_file_path ,"r") as zip_ref:
   zip_ref.extractall(dataset_path)

This created 2 json files imdb_reviews.json and imdb_movie_details.json

In [ ]:
# Read reviews from json file

import json
import pandas as pd
import os

dataset_path = os.path.join(prefix_path, "dataset", "imdb_spoiler")

reviews_json_path = os.path.join(dataset_path, "IMDB_reviews.json")
movie_details_json_path = os.path.join(dataset_path, "IMDB_movie_details.json")

reviews = []
with open(reviews_json_path, 'r') as f:
    for line in f:
        reviews.append(json.loads(line))

reviews_df = pd.DataFrame(reviews)

movie_details = []
with open(movie_details_json_path, 'r') as f:
    for line in f:
        movie_details.append(json.loads(line))

movie_details_df = pd.DataFrame(movie_details)

Data pre-processing

In [ ]:
reviews_df
Out[ ]:
review_date movie_id user_id is_spoiler review_text rating review_summary
0 10 February 2006 tt0111161 ur1898687 True In its Oscar year, Shawshank Redemption (writt... 10 A classic piece of unforgettable film-making.
1 6 September 2000 tt0111161 ur0842118 True The Shawshank Redemption is without a doubt on... 10 Simply amazing. The best film of the 90's.
2 3 August 2001 tt0111161 ur1285640 True I believe that this film is the best story eve... 8 The best story ever told on film
3 1 September 2002 tt0111161 ur1003471 True **Yes, there are SPOILERS here**This film has ... 10 Busy dying or busy living?
4 20 May 2004 tt0111161 ur0226855 True At the heart of this extraordinary movie is a ... 8 Great story, wondrously told and acted
... ... ... ... ... ... ... ...
573908 8 August 1999 tt0139239 ur0100166 False Go is wise, fast and pure entertainment. Assem... 10 The best teen movie of the nineties
573909 31 July 1999 tt0139239 ur0021767 False Well, what shall I say. this one´s fun at any ... 9 Go - see the movie
573910 20 July 1999 tt0139239 ur0392750 False Go is the best movie I have ever seen, and I'v... 10 It's the best movie I've ever seen
573911 11 June 1999 tt0139239 ur0349105 False Call this 1999 teenage version of Pulp Fiction... 3 Haven't we seen this before?
573912 3 May 1999 tt0139239 ur0156431 False Why was this movie made? No doubt to sucker in... 2 Go doesn't go anywhere

573913 rows × 7 columns

Add a review_id feature to distincguish between reviews on join.

In [ ]:
from uuid import uuid4
reviews_df['review_id'] = reviews_df.index.to_series().map(lambda x: uuid4())
In [ ]:
movie_details_df
Out[ ]:
movie_id plot_summary duration genre rating release_date plot_synopsis
0 tt0105112 Former CIA analyst, Jack Ryan is in England wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 Jack Ryan (Ford) is on a "working vacation" in...
1 tt1204975 Billy (Michael Douglas), Paddy (Robert De Niro... 1h 45min [Comedy] 6.6 2013-11-01 Four boys around the age of 10 are friends in ...
2 tt0243655 The setting is Camp Firewood, the year 1981. I... 1h 37min [Comedy, Romance] 6.7 2002-04-11
3 tt0040897 Fred C. Dobbs and Bob Curtin, both down on the... 2h 6min [Adventure, Drama, Western] 8.3 1948-01-24 Fred Dobbs (Humphrey Bogart) and Bob Curtin (T...
4 tt0126886 Tracy Flick is running unopposed for this year... 1h 43min [Comedy, Drama, Romance] 7.3 1999-05-07 Jim McAllister (Matthew Broderick) is a much-a...
... ... ... ... ... ... ... ...
1567 tt0289879 Evan Treborn grows up in a small town with his... 1h 53min [Sci-Fi, Thriller] 7.7 2004-01-23 In the year 1998, Evan Treborn (Ashton Kutcher...
1568 tt1723811 Brandon is a 30-something man living in New Yo... 1h 41min [Drama] 7.2 2012-01-13 Brandon (Michael Fassbender) is a successful, ...
1569 tt5013056 Evacuation of Allied soldiers from the British... 1h 46min [Action, Drama, History] 8.1 2017-07-21 The film alternates between three different pe...
1570 tt0104014/ For a while now, beautiful 24-year-old Diana B... 1h 33min [Comedy, Drama] 5.3 1992-02-21
1571 tt0114142/ The marriage of David Burgess, a senior execut... 1h 32min [Drama, Thriller] 4.0 1999-01-29

1572 rows × 7 columns

In [ ]:
# Find movies where plot synapsis length is zero
movie_details_df[movie_details_df['plot_synopsis'].str.len() == 0]
Out[ ]:
movie_id plot_summary duration genre rating release_date plot_synopsis
2 tt0243655 The setting is Camp Firewood, the year 1981. I... 1h 37min [Comedy, Romance] 6.7 2002-04-11
10 tt0107131 Three pets (Chance, a young dog unfamiliar wit... 1h 24min [Adventure, Comedy, Drama] 6.9 1993-02-12
11 tt0110364 In Urbania, Ohio, snobby ex-football star Kevi... 1h 47min [Comedy, Family, Sport] 6.3 1994-10-14
22 tt0101700 The story is centered on a microcosm of a post... 1h 39min [Comedy, Crime] 7.7 1992-04-03
24 tt0102768 Henry is a lawyer who survives a shooting only... 1h 48min [Drama, Romance] 6.7 1991-07-10
... ... ... ... ... ... ... ...
1556 tt0285531 Four childhood friends, Jonesy, Beaver, Pete a... 2h 14min [Drama, Horror, Sci-Fi] 5.5 2003-03-21
1564 tt0914798 Young Bruno lives a wealthy lifestyle in prewa... 1h 34min [Drama, War] 7.8 2008-11-26
1566 tt4047038 Centers on the titular holistic detective who ... 1h [Comedy, Mystery, Sci-Fi] 8.4 2017-04-01
1570 tt0104014/ For a while now, beautiful 24-year-old Diana B... 1h 33min [Comedy, Drama] 5.3 1992-02-21
1571 tt0114142/ The marriage of David Burgess, a senior execut... 1h 32min [Drama, Thriller] 4.0 1999-01-29

233 rows × 7 columns

In [ ]:
movie_details_df[movie_details_df['plot_summary'].str.len() == 0]
Out[ ]:
movie_id plot_summary duration genre rating release_date plot_synopsis

There are no movies with empty plot_summary

Let us examine the text of a plot summary for a movie for which plot synopsis is empty.

In [ ]:
movie_details_df[movie_details_df['movie_id'] == 'tt0243655']['plot_summary'].item()
Out[ ]:
"The setting is Camp Firewood, the year 1981. It's the last day before everyone goes back to the real world, but there's still a summer's worth of unfinished business to resolve. At the center of the action is camp director Beth, who struggles to keep order while she falls in love with the local astrophysics professor. He is busy trying to save the camp from a deadly piece of NASA's Skylab which is hurtling toward earth. All that, plus: a dangerous waterfall rescue, love triangles, misfits, cool kids, and talking vegetable cans. The questions will all be resolved, of course, at the big talent show at the end of the day.                Written by\nSujit R. Varma"

Let us examine a plot synopsis and summary for which the plot syopsis is not empty

In [ ]:
movie_details_df[movie_details_df['plot_synopsis'].str.len() != 0]
Out[ ]:
movie_id plot_summary duration genre rating release_date plot_synopsis
0 tt0105112 Former CIA analyst, Jack Ryan is in England wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 Jack Ryan (Ford) is on a "working vacation" in...
1 tt1204975 Billy (Michael Douglas), Paddy (Robert De Niro... 1h 45min [Comedy] 6.6 2013-11-01 Four boys around the age of 10 are friends in ...
3 tt0040897 Fred C. Dobbs and Bob Curtin, both down on the... 2h 6min [Adventure, Drama, Western] 8.3 1948-01-24 Fred Dobbs (Humphrey Bogart) and Bob Curtin (T...
4 tt0126886 Tracy Flick is running unopposed for this year... 1h 43min [Comedy, Drama, Romance] 7.3 1999-05-07 Jim McAllister (Matthew Broderick) is a much-a...
5 tt0286716 Bruce Banner, a brilliant scientist with a clo... 2h 18min [Action, Sci-Fi] 5.7 2003-06-20 Bruce Banner (Eric Bana) is a research scienti...
... ... ... ... ... ... ... ...
1563 tt0120655 An abortion clinic worker with a special herit... 2h 10min [Adventure, Comedy, Drama] 7.3 1999-11-12 The film opens with a homeless man (Bud Cort) ...
1565 tt0276751 Twelve year old Marcus Brewer lives with his c... 1h 41min [Comedy, Drama, Romance] 7.1 2002-05-17 Will Freeman (Hugh Grant) is a 38-year-old bac...
1567 tt0289879 Evan Treborn grows up in a small town with his... 1h 53min [Sci-Fi, Thriller] 7.7 2004-01-23 In the year 1998, Evan Treborn (Ashton Kutcher...
1568 tt1723811 Brandon is a 30-something man living in New Yo... 1h 41min [Drama] 7.2 2012-01-13 Brandon (Michael Fassbender) is a successful, ...
1569 tt5013056 Evacuation of Allied soldiers from the British... 1h 46min [Action, Drama, History] 8.1 2017-07-21 The film alternates between three different pe...

1339 rows × 7 columns

In [ ]:
movie_details_df[movie_details_df['movie_id'] == 'tt0105112']['plot_summary'].item()
Out[ ]:
"Former CIA analyst, Jack Ryan is in England with his family on vacation when he suddenly witnesses an explosion outside Buckingham Palace. It is revealed that some people are trying to abduct a member of the Royal Family but Jack intervenes, killing one of them and capturing the other, and stops the plan in its tracks. Afterwards, he learns that they're Irish revolutionaries and the two men are brothers. During his court hearing the one that's still alive vows to get back at Jack but is sentenced and that seems to be the end of it. However, whilst the man is being transported, he is broken out. Jack learns of this but doesn't think there's anything to worry about. But, when he is at the Naval Academy someone tries to kill him. He learns that they are also going after his family and so he rushes to find them, safe but having also been the victims of a failed assassination. That's when Jack decides to rejoin the CIA, and they try to find the man before he makes another attempt.                Written by\nrcs0411@yahoo.com"
In [ ]:
movie_details_df[movie_details_df['movie_id'] == 'tt0105112']['plot_synopsis'].item()
Out[ ]:
'Jack Ryan (Ford) is on a "working vacation" in London with his family. He has retired from the CIA and is a Professor at the US Naval Academy. He is seen delivering a lecture at the Royal Naval Academy in London.Meanwhile, Ryan\'s wife Cathy and daughter Sally are sightseeing near Buckingham Palace. Sally and Cathy come upon a British Royal Guard, and Sally tries to get the guard to react by doing an improvised tap dance in front of him. She\'s impressed when the guard, trained to ignore distraction, doesn\'t react at all, and they leave.As Sally and Cathy walk away from the guard, en route to rendezvous with Ryan, they walk by a stolen cab, in which sit three Ulster Liberation Army terrorists: Kevin O\'Donnell, the driver, as well as Sean Miller (Sean Bean) and his younger brother Patrick. The three are loading bullets into their guns as they prepare to carry out a scheduled ambush on Lord William Holmes, British Secretary of State for Northern Ireland and a distant member of the British Royal Family (the cousin of the Queen Mother). As they finish loading their weapons, another terrorist, Annette, radios to them that Lord Holmes is leaving the palace with his wife and son.As Lord Holmes\'s car leaves the palace, the terrorists\' car falls in line behind him. They follow the car, headed for the ambush point. As they drive, O\'Donnell and the Miller brothers don ski masks.Meanwhile, Ryan is about to cross the street to meet with his wife and daughter. He arrives at the same time that Lord Holmes\'s car is passing through the area.Suddenly, another cab, parked on the side of the road, driven by Annette, pulls out and swerves in front of Lord Holmes\'s vehicle. Simultaneously, O\'Donnell\'s vehicle swerves to a stop behind the car. The Millers and another accomplice jump out and quickly plant a bomb underneath the chauffeured car\'s engine. Ryan sees what\'s going on and hastily gets his family to cover just as the bomb is detonated. Moments later, Sean and another accomplice begin shooting into the car, riddling Lord Holmes\'s driver and bodyguard with bullet. The onslaught of bullets does not kill Lord Holmes, who is currently using his body to shield his family. Sean marches over to the rear door and orders Lord Holmes out of the car at gunpoint. Ryan, unable to stand being just a bystander, immediately breaks cover, runs up behind Miller, and disarms him. As Ryan grabs Sean\'s pistol, he gets shot in the left shoulder by another terrorist. He shoots one of the other terrorists in the shoulder. Patrick attempts to flee, but Ryan fatally shoots him in the chest. Seeing two of the royal guards racing towards the scene, the remaining terrorists quickly get back in their cars and drive off, leaving Sean to be captured when the authorities arrive. They retreat back to a warehouse, where they change out license plates.While recovering, Ryan is called to testify against Miller as a witness. Subsequently, Miller is convicted on all charges and sentenced to life in prison. Ryan is awarded the order of Knight Commander of the Victorian Order, and eventually returns to the United States.On the possibility that Miller\'s ULA comrades might try to liberate him, the authorities take no chances. To that extent, they set up decoy convoys on the day that Miller is to be transferred to Albany Prison on the Isle of Wight. However, someone manages to tip off the ULA as to which convoy is carrying Miller. Miller\'s motorcade is in transit when it comes to a stop at a drawbridge. The raised bridge, however, is a trap meant to block them in. Moments after the convoy comes to a stop, Miller\'s comrades attack, using rocket launchers to blow up the cars at the front and rear of the convoy. They then march up to the prison van and order the guards out at gunpoint by taking the bridge operator hostage and threatening to shoot him. One of the guards hesitates and is shot fatally. The other officer and the inspector are pulled out, thrown on the ground and forced to lie on their stomachs. O\'Donnell hands Miller a pistol, with which he coldly executes both officers and the bridge operator by shooting them in the backs of their heads. They then depart the scene.Miller and his companions flee on a cargo ship from Britain to Libya, to prepare for their next kidnapping attempt on Lord Holmes. Miller however, cannot shake his anger towards Ryan for killing his younger brother and persuades several members of his entourage to accompany him to the United States on a short mission targeting Ryan and his family.Miller travels to the United States, accompanied by Annette, comrade Ned Clark, and a fourth henchman. Annette and Clark travel to the United States Naval Academy to ambush Ryan as he\'s leaving work. Ryan notices Clark idling nearby. Clark casually walks away, but Ryan clearly unnerved, then gets even more nervous as he hears a car engine starting. His suspicions are well warranted-as he continues walking down the street, he looks in the reflection of a parked van\'s rearview door and sees that Clark, and the stolen car driven by Annette, are following him. Ryan pretends not to see them until he\'s walking past a parked panel truck, at which point he ducks behind the truck. Clark draws a silenced pistol and prepares to corner Ryan, but Ryan attacks him first. Clark quickly gets the upper hand and throws Ryan to the curb. He grabs his pistol and is about to shoot Ryan when he\'s shot dead himself by a US Marine sentry. Annette quickly drives away in her Jeep. Ryan, realizing that Miller is going to target Cathy and Sally, jumps into his car and drives off.Simultaneously, Miller and the other henchman follow Cathy as she picks up Sally from school, with Miller riding shotgun. They prepare to follow the two, but are temporarily held up by a crossing guard. Once the crossing guard clears, they pursue her to US Highway 50. Ryan tries to frantically call Cathy on his car phone, but she is busy answering phone calls from the hospital. As he\'s trying to connect with her, Miller and his henchman recklessly weave in and out of traffic to catch up with Cathy. Ryan eventually gets through to Cathy and tells her to get off the highway and find the nearest police station. Unfortunately, just as the call ends, Miller\'s van comes charging up on her using the left shoulder. Miller moves to the backseat of the van, grabs a submachine gun, and slides open the door. Another car momentarily delays him, but once he has a clear aim, Miller opens fire, spraying Cathy\'s car with bullets. A multicar pileup ensues as Cathy loses control of her car and crashes head-on into a concrete barrier. Ryan, who has found himself stuck in traffic headed in the other direction, sees the smoke from the crash and realizes the worst.Cathy and Sally are airlifted to a nearby hospital for treatment. Enraged over the near-loss of his family, Ryan decides to go back to work for the CIA, having earlier rejected the appeal of his former superior, Vice Admiral James Greer (James Earl Jones).Ryan\'s tireless work leads him to conclude that Miller has taken refuge in a training camp, one of many located in Libya. He also determines that a recent new arrival at the camp may be one of O\'Donnell\'s English informants, Dennis Cooley. Ryan makes his recommendations to his superiors at the CIA. Greer invites Jack to a live feed of an SAS strike team attack on the camp. Everyone in the camp is eliminated but unbeknownst to the CIA and Ryan, Miller and his companions have already fled the camp and are on their way to the US to stage their next attack on Lord Holmes.Lord Holmes decides to visit Ryan at his home to formally present his KCVO. With the aid of Lord Holmes\' traitorous assistant, Miller\'s group tracks Holmes to this location, kills the Diplomatic Security Service agents and Maryland state troopers guarding the house, and attempts once more to kidnap Lord Holmes. Ryan leads Holmes and his family to safety while he attempts to lure Miller and his companions away from his home.The FBI Hostage Rescue Teams are scrambled to pick up Holmes. Upon realising that Ryan is leading them away from Holmes, Miller\'s companions try to persuade Miller to turn around, but an enraged and deranged Miller kills his terrorist companions and continues his pursuit of Ryan. Ryan and Miller fight hand to hand; Miller is killed when Ryan impales him backward on a boat anchor, and his body is obliterated in the subsequent explosion of the craft.Credits roll just after Caroline Ryan learns the gender of the child she is going to have, and before she tells Jack and Sally.'

The plot synopsis looks too lengthy and may not make much sense. Let us look at the reveiws for this movie and see which ones are spoilers and not.

Look for null values

In [ ]:
# Do all reviews have movie_id associated

reviews_df[reviews_df['movie_id'].isna()]
Out[ ]:
review_date movie_id user_id is_spoiler review_text rating review_summary review_id
In [ ]:
reviews_df.isna().sum()
Out[ ]:
review_date       0
movie_id          0
user_id           0
is_spoiler        0
review_text       0
rating            0
review_summary    0
review_id         0
dtype: int64
In [ ]:
movie_details_df.isna().sum()
Out[ ]:
movie_id         0
plot_summary     0
duration         0
genre            0
rating           0
release_date     0
plot_synopsis    0
dtype: int64

Join reviews and movie_details

In [ ]:
## Join reviews and movie_detail

movie_reviews = pd.merge(movie_details_df, reviews_df, on = ['movie_id'], how = 'outer', indicator = True)
movie_reviews.head()
Out[ ]:
movie_id plot_summary duration genre rating_x release_date plot_synopsis review_date user_id is_spoiler review_text rating_y review_summary review_id _merge
0 tt0105112 Former CIA analyst, Jack Ryan is in England wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 Jack Ryan (Ford) is on a "working vacation" in... 5 March 2008 ur16517420 True The second Tom Clancy novel made into a film (... 6 decent Clancy thriller bbc4b9ff-ea1b-455a-8b7e-32370d35fe17 both
1 tt0105112 Former CIA analyst, Jack Ryan is in England wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 Jack Ryan (Ford) is on a "working vacation" in... 15 October 2007 ur13887584 True The second in what looks like becoming the 'Ja... 7 Jack Ryan Chapter II e04d4f23-89b5-4715-add8-1ee37d46dd99 both
2 tt0105112 Former CIA analyst, Jack Ryan is in England wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 Jack Ryan (Ford) is on a "working vacation" in... 16 September 2013 ur2403537 True I was not a fan of The Hunt For Red October. I... 8 It's filled to the brim with excitement and it... b28cbca4-dc30-421c-8e3a-dc2e3cba98c1 both
3 tt0105112 Former CIA analyst, Jack Ryan is in England wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 Jack Ryan (Ford) is on a "working vacation" in... 15 May 2007 ur0819382 True Jack Ryan (Harrison Ford) is a CIA analyst who... 7 Patriotic Gore. c657363f-224d-48db-a029-8e4ddd2b248d both
4 tt0105112 Former CIA analyst, Jack Ryan is in England wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 Jack Ryan (Ford) is on a "working vacation" in... 29 February 2004 ur2567338 True This was one of the big summer movies of 1992.... 7 Harrison Ford In Another Tom Clancy Adaptation 44c6a555-674d-4f1c-a9fa-88b35b82c8ba both
In [ ]:
# movies without any review

movie_reviews[movie_reviews['_merge'] == 'left_only']
Out[ ]:
movie_id plot_summary duration genre rating_x release_date plot_synopsis review_date user_id is_spoiler review_text rating_y review_summary review_id _merge
573906 tt0104014/ For a while now, beautiful 24-year-old Diana B... 1h 33min [Comedy, Drama] 5.3 1992-02-21 NaN NaN NaN NaN NaN NaN NaN left_only
573907 tt0114142/ The marriage of David Burgess, a senior execut... 1h 32min [Drama, Thriller] 4.0 1999-01-29 NaN NaN NaN NaN NaN NaN NaN left_only
In [ ]:
# reviews without a movie_id

movie_reviews[movie_reviews['_merge'] == 'right_only']
Out[ ]:
movie_id plot_summary duration genre rating_x release_date plot_synopsis review_date user_id is_spoiler review_text rating_y review_summary review_id _merge
573908 tt0104014 NaN NaN NaN NaN NaN NaN 29 August 2006 ur0039450 False This must be one of the most erotic films ever... 10 Erotic with an big E c4187589-e37f-4af4-a28d-d22d864c3667 right_only
573909 tt0104014 NaN NaN NaN NaN NaN NaN 16 July 2012 ur5358902 False Tinto Brass is usually referred to as either a... 7 Silly and Sexy 83ddfdd4-f4eb-4dbb-8b43-34c09531e83e right_only
573910 tt0104014 NaN NaN NaN NaN NaN NaN 25 October 2015 ur63416246 False One of the best erotic movies of all times. Mo... 8 Excellento 66bb64ee-b216-4809-bff0-d7ddf2bfbed8 right_only
573911 tt0104014 NaN NaN NaN NaN NaN NaN 14 February 2006 ur2851731 False call me crass, but i preferred the higher titi... 3 horrible male lead 08f1b8d5-2801-4b4c-9348-7ceb54988a44 right_only
573912 tt0104014 NaN NaN NaN NaN NaN NaN 11 April 2008 ur10916790 False I heard Tinto Brass did eroticism, fine. I hea... 1 Go rent a playboy video 1931f6ef-d218-49c3-968f-661406c64ec7 right_only
573913 tt0114142 NaN NaN NaN NaN NaN NaN 25 July 2010 ur6302267 False Yet another erotic thriller involving a web of... 3 The women are hot, things end there 7fd32d47-5622-40c9-9c86-aa419d027e10 right_only
573914 tt0114142 NaN NaN NaN NaN NaN NaN 27 December 2000 ur0982623 False Just a standard softcore flick from the playbo... 8 Standard playboy production - bad script, sexy... 4e1bbcec-957d-454c-9d5b-63a173238d51 right_only

Note that all movies don't have reviews and all reviews don't have a movie associated with it/

Observe that tt0104014/ should be tt0104014 in movie_details. Let us see if there aremore of such formatting erros in movie_id as that is the foreign key between the 2 tables reviews_df and movie_details_df.

In [ ]:
movie_details_df[movie_details_df['movie_id'].astype(str).str.contains('/')]
Out[ ]:
movie_id plot_summary duration genre rating release_date plot_synopsis
1570 tt0104014/ For a while now, beautiful 24-year-old Diana B... 1h 33min [Comedy, Drama] 5.3 1992-02-21
1571 tt0114142/ The marriage of David Burgess, a senior execut... 1h 32min [Drama, Thriller] 4.0 1999-01-29
In [ ]:
movie_details_df['movie_id'] = movie_details_df['movie_id'].str.replace('/', '')
movie_details_df['movie_id'] = movie_details_df['movie_id'].str.replace('/', '')
In [ ]:
movie_reviews = pd.merge(movie_details_df, reviews_df, on = ['movie_id'], how = 'outer', indicator = True)
movie_reviews.head()
Out[ ]:
movie_id plot_summary duration genre rating_x release_date plot_synopsis review_date user_id is_spoiler review_text rating_y review_summary review_id _merge
0 tt0105112 Former CIA analyst, Jack Ryan is in England wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 Jack Ryan (Ford) is on a "working vacation" in... 5 March 2008 ur16517420 True The second Tom Clancy novel made into a film (... 6 decent Clancy thriller bbc4b9ff-ea1b-455a-8b7e-32370d35fe17 both
1 tt0105112 Former CIA analyst, Jack Ryan is in England wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 Jack Ryan (Ford) is on a "working vacation" in... 15 October 2007 ur13887584 True The second in what looks like becoming the 'Ja... 7 Jack Ryan Chapter II e04d4f23-89b5-4715-add8-1ee37d46dd99 both
2 tt0105112 Former CIA analyst, Jack Ryan is in England wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 Jack Ryan (Ford) is on a "working vacation" in... 16 September 2013 ur2403537 True I was not a fan of The Hunt For Red October. I... 8 It's filled to the brim with excitement and it... b28cbca4-dc30-421c-8e3a-dc2e3cba98c1 both
3 tt0105112 Former CIA analyst, Jack Ryan is in England wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 Jack Ryan (Ford) is on a "working vacation" in... 15 May 2007 ur0819382 True Jack Ryan (Harrison Ford) is a CIA analyst who... 7 Patriotic Gore. c657363f-224d-48db-a029-8e4ddd2b248d both
4 tt0105112 Former CIA analyst, Jack Ryan is in England wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 Jack Ryan (Ford) is on a "working vacation" in... 29 February 2004 ur2567338 True This was one of the big summer movies of 1992.... 7 Harrison Ford In Another Tom Clancy Adaptation 44c6a555-674d-4f1c-a9fa-88b35b82c8ba both
In [ ]:
movie_reviews[movie_reviews['_merge'] == 'left_only']
Out[ ]:
movie_id plot_summary duration genre rating_x release_date plot_synopsis review_date user_id is_spoiler review_text rating_y review_summary review_id _merge
In [ ]:
movie_reviews[movie_reviews['_merge'] == 'right_only']
Out[ ]:
movie_id plot_summary duration genre rating_x release_date plot_synopsis review_date user_id is_spoiler review_text rating_y review_summary review_id _merge

Now, there are no more reviews which don't have a movie associated to it and vice versa.

In [ ]:
movie_reviews[movie_reviews['_merge'] == 'both']
Out[ ]:
movie_id plot_summary duration genre rating_x release_date plot_synopsis review_date user_id is_spoiler review_text rating_y review_summary review_id _merge
0 tt0105112 Former CIA analyst, Jack Ryan is in England wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 Jack Ryan (Ford) is on a "working vacation" in... 5 March 2008 ur16517420 True The second Tom Clancy novel made into a film (... 6 decent Clancy thriller bbc4b9ff-ea1b-455a-8b7e-32370d35fe17 both
1 tt0105112 Former CIA analyst, Jack Ryan is in England wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 Jack Ryan (Ford) is on a "working vacation" in... 15 October 2007 ur13887584 True The second in what looks like becoming the 'Ja... 7 Jack Ryan Chapter II e04d4f23-89b5-4715-add8-1ee37d46dd99 both
2 tt0105112 Former CIA analyst, Jack Ryan is in England wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 Jack Ryan (Ford) is on a "working vacation" in... 16 September 2013 ur2403537 True I was not a fan of The Hunt For Red October. I... 8 It's filled to the brim with excitement and it... b28cbca4-dc30-421c-8e3a-dc2e3cba98c1 both
3 tt0105112 Former CIA analyst, Jack Ryan is in England wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 Jack Ryan (Ford) is on a "working vacation" in... 15 May 2007 ur0819382 True Jack Ryan (Harrison Ford) is a CIA analyst who... 7 Patriotic Gore. c657363f-224d-48db-a029-8e4ddd2b248d both
4 tt0105112 Former CIA analyst, Jack Ryan is in England wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 Jack Ryan (Ford) is on a "working vacation" in... 29 February 2004 ur2567338 True This was one of the big summer movies of 1992.... 7 Harrison Ford In Another Tom Clancy Adaptation 44c6a555-674d-4f1c-a9fa-88b35b82c8ba both
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
573908 tt0104014 For a while now, beautiful 24-year-old Diana B... 1h 33min [Comedy, Drama] 5.3 1992-02-21 25 October 2015 ur63416246 False One of the best erotic movies of all times. Mo... 8 Excellento 66bb64ee-b216-4809-bff0-d7ddf2bfbed8 both
573909 tt0104014 For a while now, beautiful 24-year-old Diana B... 1h 33min [Comedy, Drama] 5.3 1992-02-21 14 February 2006 ur2851731 False call me crass, but i preferred the higher titi... 3 horrible male lead 08f1b8d5-2801-4b4c-9348-7ceb54988a44 both
573910 tt0104014 For a while now, beautiful 24-year-old Diana B... 1h 33min [Comedy, Drama] 5.3 1992-02-21 11 April 2008 ur10916790 False I heard Tinto Brass did eroticism, fine. I hea... 1 Go rent a playboy video 1931f6ef-d218-49c3-968f-661406c64ec7 both
573911 tt0114142 The marriage of David Burgess, a senior execut... 1h 32min [Drama, Thriller] 4.0 1999-01-29 25 July 2010 ur6302267 False Yet another erotic thriller involving a web of... 3 The women are hot, things end there 7fd32d47-5622-40c9-9c86-aa419d027e10 both
573912 tt0114142 The marriage of David Burgess, a senior execut... 1h 32min [Drama, Thriller] 4.0 1999-01-29 27 December 2000 ur0982623 False Just a standard softcore flick from the playbo... 8 Standard playboy production - bad script, sexy... 4e1bbcec-957d-454c-9d5b-63a173238d51 both

573913 rows × 15 columns

Let us look at the reviws for movie_id 'tt0105112' we were looking at before.

As the problem here is regarding the review_text being a spoiler or not, let us focus on that. Let us look at reviews which are marked true for spoiler vs not true.

In [ ]:
movie_reviews[(movie_reviews['movie_id'] == 'tt0105112') & (movie_reviews['is_spoiler'] == True)]['review_text'].head(1).item()
Out[ ]:
'The second Tom Clancy novel made into a film (\'The Hunt For Red October\' being the first), \'Patriot Games\' opts for a somewhat smaller scale. There\'s no chance World War III will result from anyone\'s actions here, but the CIA finds enough bad business coming from the Irish Republican Army to keep itself and those amazing spy satellites it employs busy. Harrison Ford is now Clancy\'s reliable hero, Jack Ryan, and Ford is a very good fit. In best Clancy fashion, Ryan is a very reluctant hero as well. He hates getting involved in other peoples\' affairs and being forced to settle them, just as he hates being forced to kick ass, but what\'s an ex-Marine CIA agent to do? The villains this time are an ultraviolent faction of the IRA; Ryan gets on their bad side when he busts up an attempt by them to kill the British Home Secretary and his family. The rest of the movie is essentially a tit-for-tat series of confrontations between Ryan/CIA and these IRA murderers. The best scenes in \'Patriot Games\' are the action ones, as well as the parts showing the CIA\'s supersecret and supersophisticated technology at work. Less successful are the scenes having to do with Ryan\'s home life, and his wife and daughter. There is an upper-class, peachy-keen smugness to them, exemplified by Anne Archer (Cathy Ryan) and her "boy, this CIA stuff is a pain in the ass" attitude. This type of material is not Clancy\'s strong suit (or the screenwriter\'s, apparently) and it shows. The trio of Irish baddies are all portrayed convincingly; Patrick Bergin, Polly Walker, and particularly Sean Bean in a menacing and venomous performance. \'Patriot Games\' succeeds mainly because of Harrison Ford\'s tough, likable, and sometimes vulnerable presence. Several key plot points do not bear close scrutiny; it strikes one as unlikely the IRA and the CIA would take on one another over what is essentially a personal vendetta. But don\'t think too hard or too much and you\'ll find \'Patriot Games\' a reasonably exciting thriller.'
In [ ]:
movie_reviews[(movie_reviews['movie_id'] == 'tt0105112') & (movie_reviews['is_spoiler'] == False)]['review_text'].head(1).item()
Out[ ]:
'If I were to make a quick summary of this movie, it would be "a good, but not great, action flick."  I don\'t think this movie was intended to be innovative, to be great, to be memorable.  What it was intended to be was entertaining, and it lived up to this expectation.Harrison Ford is the master of remaining as low-key as possible.  He has had so many films where he can over-emphasize, over-dramatize, and simply over-act.  Ford has the maturity to keep this from happening, and he does so again in this film.  Ford is supported quite well with such actors/actresses as James Earl Jones (reprising his role as Greer from The Hunt for Red October), Samuel L. Jackson, and Anne Archer.  For an action flick, acting is much above average.It was also suspenseful...thrilling.  I thought the ending had a nice build-up and climax.  I realize this is different from the book\'s ending...but remember, these are two different visions...not the same one. This brings up a big point in movie-watching:  Comparing a movie to a book is the worst thing you can do.  They are two completely different genres. More importantly, as I said before, they are two different visions.I\'m not a big action fan.  I used to be, but anymore most stories are the same.  To this end I would say Patriot Games is fairly mainstream. However, it was always fun to watch and sometimes thrilling to watch.  I\'m also not a big Tom Clancy fan, but I\'ve given the movies based off his books a shot because again they are different visions.  The result is (at least with Patriot Games) a good film to watch on a Saturday night with a bowl of popcorn.  Don\'t expect great things, but don\'t expect to be disappointed, either.Rating:  7/10'

The review_text follows from the plot summary. Plot synopsis could have too much information not in the plot summary. Given the limit on the seq length acceptable for the deep learning models, plot_summary could be a more useful feature.

Examine the reviews for the movie for which plot synopsis is of length 0

In [ ]:
movie_reviews[movie_reviews['movie_id'] == 'tt0243655']['is_spoiler'].value_counts()
Out[ ]:
is_spoiler
False    193
True      20
Name: count, dtype: int64
In [ ]:
movie_reviews[(movie_reviews['movie_id'] == 'tt0243655') & (movie_reviews['is_spoiler'])]['review_text'].head(1).item()
Out[ ]:
'"Not ANOTHER \'Another Teen Movie,\'" is the accompanying groan to most introductions to Wet Hot American Summer. How could a title and premise that simultaneously stupid and generic be anything more than the sum of its exorbitantly familiar parts? It doesn\'t help that every plot summary reads roughly exactly how you\'d expect it to; even stray quotations out of context play like recycled outtakes from American Pie. If anything, it seems amazing how barefaced the film is about its complete unoriginality. So do yourself a favour \x96 stop reading plot summaries, perusing online quotes, and just go see the film instead. Don\'t watch the trailer. Even stop reading this review. Stop it. Please stop.What are you doing still reading? I asked you nicely and everything! \x85I have failed at my objective.The inevitable twist, of course: all of the above claims are actually true. And yet, somehow, Wet Hot American Summer remains an almost uncomfortably enjoyable watch. This is a film that gleefully tries to hide how far-removed from the film you think it\'s going to be, patiently wearing the mask of a normal \'Murrcan teenage coming-of-age camp drama, filled with first kisses, first times, illicit substances, and plenty of second-hand embarrassment. Don\'t get me wrong: those things are all still there. But the schmaltz of the standard framework has been jettisoned for something altogether livelier, funnier, and even more honest. \'Parody\' is the term used to encompass the film\'s approach to genre, but it may as well function as a parody of expectations of normal narrative beats and cohesion. It\'s almost as if writer/director David Wain and writer/star Michael Showalter have tossed a pinch of magic (or at least \'special sauce\') into the pressure cooker of rote garbage and let it rip with no lid, exploding everywhere in a mess too delicious to be turned down (or something\x85 I think that metaphor got away from me). The best joke of all? It works. Weirdly well. No, this is a film whose heart and inspiration lies somewhere between the SNL-spin off insanity of Meatballs and National Lampoon, blended with a healthy helping of the surrealistic insanity of Monty Python, and culminating in just enough heartfelt personal experience to feel strangely honest, in spite of the trappings. Narratively, it plays like a collection of improv sketches with only a rough template to eventually veer back to \x96 an Anchorman before Anchorman, if you will. Wain and Showalter take as much pleasure in making you fall in love with their earnest, archetypal characters, as they do pulling the rug out from what you would traditionally expect or even want to happen to them. Zaniness abounds, but somehow, by pushing the tropes to hyperbolic, with a side helping of irreverent and insane, ye olde teenage love subplot clichés feel far more heartfelt and credible than they would in the far flashier and stupider films that birthed them. Like any good comedy, this is a film that establishes its own internal world of conventions and possibilities. Yes, this is a world where Paul Rudd, upon finding out that, due to his negligence as a lifeguard, one of his campers has drowned, responds by offhandedly disposing of the witness\x85 and we\'re expected to laugh. And if laugh you do, the film\'s got another one for ya. Some of the bits are shrewder satirical genius (the "into town" sequence hysterically lampoons every parent\'s stern lectures about the escalation of substance use), but the film steadily gives in to being willfully deranged, building to a finale so ridiculous it feels like the apex of a drug trip\x85and, thus, entirely appropriate. In many ways, the film\'s biggest gag is its oddly prophetic assembly of actors who would go on to become comedic superstars, and an extra level of hilarity is derived from watching them irreverently having a blast here. Rudd steals the show, exquisitely uncompromising as the camp\'s resident pretty-boy-dirtbag, but Amy Poehler is a close second, perfecting her routine personas as both earnestly inspirational and ferocious control freak. Bradley \'blink-and-you- might-recognize-him\' Cooper, of all people, ends up headlining a romantic subplot that leads to the most unexpectedly tender moments of the film, while Christopher Meloni unabashedly steals every scene he appears in, a comedic triumph as the camp\'s shellshocked cook. Janeanne Garofalo and David Hyde-Pierce are both brilliantly in their element, blending sardonic barbs and rambling sweetness, as the camp\'s director and resident astrophysicist respectively, while Showalter himself is adorably fumbling as the lovestruck nerd (alongside a surprise cameo appearance) and Marguerite Moreau delivers some of the film\'s most incredibly acerbic dialogue while impressively never dropping her beaming smile. Molly Shannon uncorks a hysterically belligerent meltdown like none other, and Elizabeth Banks is hilariously dopey as one of the camp\'s seemingly endless slew of nymphomaniacs (go figure). Comedy this bonkers is certainly not for all tastes, but it\'s worth at least conceding which tastes the film should appeal to, which the film\'s latent cult success should attest to. Ultimately, in Wet Hot American Summer, coherence doesn\'t really matter: like an appropriately wild party, by the end of the film, it\'s hard to remember who was involved in which subplot, and who\'s kissed who, until you realize it hardly matters, and to just go along with the ride, and leave with Jefferson Starship\'s "Jane" still pounding in your head. Now, if you\'ll excuse me, I\'m going to go hump the refrigerator. -8/10'
In [ ]:
movie_reviews[(movie_reviews['movie_id'] == 'tt0243655') & (movie_reviews['is_spoiler'] == False)]['review_text'].head(1).item()
Out[ ]:
'The strangest point in this film is a point, about a half hour through, when all of the seemingly normal camp counselors go out in to the city.  In a montage shot, they slowly go from reading books at the library, to smoking cigarettes, to smoking joints, to buying cocaine from a guy on the street, to becoming prostitutes, then becoming strung out heroin-addicts at the local crack house.  This is not your parent\'s parody movie.From this point on, the film is never the same.  Seemingly normal character development goes out the window, and characters jump from one complete different personality to another within seconds.  We get brilliant lines of dialogue that could only be brought from members of The State, like: "Hey, there\'s a problem.  I\'ve got something I need to tell you." "Oh no!  You have crabs."  "No.  Well, yeah, but that\'s not the problem."  "Oh good."Pure genius!!!  Why don\'t more comedies have lines of dialogue like this?Some other great points: "There is a way we could save everyone\'s lives. Well, no that couldn\'t work.  In order for it to work, we would need to have a device that could randomly generate numbers between 1 and 20." "That\'s impossible.  That would take some sort of highly advanced supercomputer to work."  "Not necessarily.  No dungeon master goes anywhere without his...20 sided die."I sort of wish I had known what films were being parodied in this movie.  I saw strands that sort of matched what I had seen from old Summer Camp movies I had seen back in high school during the wee hours in the morning during comedy central.  But, there were many things here that seemed like they must be references to other films.   Oh well.  It\'s just all crazy.Anyway, I guess I should say that this isn\'t a really good...movie, per se, but...well, I have no excuses for it.  Wet Hot American Summer rules! Dolphins suck it!!!'

The review_text for this movie looks arbitrarily marked as spoiler or not irrespective of they have the plot or not. So we will remove those movies from the dataset for which the plot_synopsis is empty.

In [ ]:
movie_reviews = movie_reviews[movie_reviews['plot_synopsis'].str.len() != 0]
In [ ]:
movie_reviews[movie_reviews['movie_id']== 'tt0243655']
Out[ ]:
movie_id plot_summary duration genre rating_x release_date plot_synopsis review_date user_id is_spoiler review_text rating_y review_summary review_id _merge

Preprocess all text columns for transfomer tokenization

Lowercasing

In [ ]:
# Lowercasing
movie_reviews['review_text'] = movie_reviews['review_text'].map(lambda x: x.lower())
movie_reviews['plot_synopsis'] = movie_reviews['plot_synopsis'].map(lambda x: x.lower())
movie_reviews['plot_summary'] = movie_reviews['plot_summary'].map(lambda x: x.lower())
movie_reviews['review_summary'] = movie_reviews['review_summary'].map(lambda x: x.lower())
In [ ]:
movie_reviews
Out[ ]:
movie_id plot_summary duration genre rating_x release_date plot_synopsis review_date user_id is_spoiler review_text rating_y review_summary review_id _merge
0 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 jack ryan (ford) is on a "working vacation" in... 5 March 2008 ur16517420 True the second tom clancy novel made into a film (... 6 decent clancy thriller bbc4b9ff-ea1b-455a-8b7e-32370d35fe17 both
1 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 jack ryan (ford) is on a "working vacation" in... 15 October 2007 ur13887584 True the second in what looks like becoming the 'ja... 7 jack ryan chapter ii e04d4f23-89b5-4715-add8-1ee37d46dd99 both
2 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 jack ryan (ford) is on a "working vacation" in... 16 September 2013 ur2403537 True i was not a fan of the hunt for red october. i... 8 it's filled to the brim with excitement and it... b28cbca4-dc30-421c-8e3a-dc2e3cba98c1 both
3 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 jack ryan (ford) is on a "working vacation" in... 15 May 2007 ur0819382 True jack ryan (harrison ford) is a cia analyst who... 7 patriotic gore. c657363f-224d-48db-a029-8e4ddd2b248d both
4 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 jack ryan (ford) is on a "working vacation" in... 29 February 2004 ur2567338 True this was one of the big summer movies of 1992.... 7 harrison ford in another tom clancy adaptation 44c6a555-674d-4f1c-a9fa-88b35b82c8ba both
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
573901 tt5013056 evacuation of allied soldiers from the british... 1h 46min [Action, Drama, History] 8.1 2017-07-21 the film alternates between three different pe... 2 August 2017 ur78983408 False dunkirk is a beautifully done movie that has h... 10 possibly the greatest war film ever made ef14e72a-77cd-4b9c-9f85-37d01c55fded both
573902 tt5013056 evacuation of allied soldiers from the british... 1h 46min [Action, Drama, History] 8.1 2017-07-21 the film alternates between three different pe... 2 August 2017 ur53645293 False dunkirk is one of the rare cases a film receiv... 10 why this could actually be one of the best war... d4bfc31d-ae8e-4e48-b616-936132153630 both
573903 tt5013056 evacuation of allied soldiers from the british... 1h 46min [Action, Drama, History] 8.1 2017-07-21 the film alternates between three different pe... 31 July 2017 ur45202903 False film gave insufficient background on what was ... 2 disappointed e47d2365-78d9-44c1-86e8-9c02d190b0cb both
573904 tt5013056 evacuation of allied soldiers from the british... 1h 46min [Action, Drama, History] 8.1 2017-07-21 the film alternates between three different pe... 28 July 2017 ur1319404 False in screen writing, a shot is an image captured... 7 dunkirk for millennials d985d458-0b1b-4cec-a5d2-6e9431770ce0 both
573905 tt5013056 evacuation of allied soldiers from the british... 1h 46min [Action, Drama, History] 8.1 2017-07-21 the film alternates between three different pe... 25 July 2017 ur4161069 False in a movie that entirely engulfs you it's rath... 10 christopher nolan reaches yet another new high... 88c38da2-d4f7-4ee2-ae1b-76c8d430d495 both

538828 rows × 15 columns

Removing unnecessary white spaces

In [ ]:
# Removing unnecessary whitespaces
movie_reviews['review_text'] = movie_reviews['review_text'].map(lambda x: ' '.join(x.split()))
movie_reviews['plot_synopsis'] = movie_reviews['plot_synopsis'].map(lambda x: ' '.join(x.split()))
movie_reviews['plot_summary'] = movie_reviews['plot_summary'].map(lambda x: ' '.join(x.split()))
movie_reviews['review_summary'] = movie_reviews['review_summary'].map(lambda x: ' '.join(x.split()))
In [ ]:
movie_reviews
Out[ ]:
movie_id plot_summary duration genre rating_x release_date plot_synopsis review_date user_id is_spoiler review_text rating_y review_summary review_id _merge
0 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 jack ryan (ford) is on a "working vacation" in... 5 March 2008 ur16517420 True the second tom clancy novel made into a film (... 6 decent clancy thriller bbc4b9ff-ea1b-455a-8b7e-32370d35fe17 both
1 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 jack ryan (ford) is on a "working vacation" in... 15 October 2007 ur13887584 True the second in what looks like becoming the 'ja... 7 jack ryan chapter ii e04d4f23-89b5-4715-add8-1ee37d46dd99 both
2 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 jack ryan (ford) is on a "working vacation" in... 16 September 2013 ur2403537 True i was not a fan of the hunt for red october. i... 8 it's filled to the brim with excitement and it... b28cbca4-dc30-421c-8e3a-dc2e3cba98c1 both
3 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 jack ryan (ford) is on a "working vacation" in... 15 May 2007 ur0819382 True jack ryan (harrison ford) is a cia analyst who... 7 patriotic gore. c657363f-224d-48db-a029-8e4ddd2b248d both
4 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 jack ryan (ford) is on a "working vacation" in... 29 February 2004 ur2567338 True this was one of the big summer movies of 1992.... 7 harrison ford in another tom clancy adaptation 44c6a555-674d-4f1c-a9fa-88b35b82c8ba both
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
573901 tt5013056 evacuation of allied soldiers from the british... 1h 46min [Action, Drama, History] 8.1 2017-07-21 the film alternates between three different pe... 2 August 2017 ur78983408 False dunkirk is a beautifully done movie that has h... 10 possibly the greatest war film ever made ef14e72a-77cd-4b9c-9f85-37d01c55fded both
573902 tt5013056 evacuation of allied soldiers from the british... 1h 46min [Action, Drama, History] 8.1 2017-07-21 the film alternates between three different pe... 2 August 2017 ur53645293 False dunkirk is one of the rare cases a film receiv... 10 why this could actually be one of the best war... d4bfc31d-ae8e-4e48-b616-936132153630 both
573903 tt5013056 evacuation of allied soldiers from the british... 1h 46min [Action, Drama, History] 8.1 2017-07-21 the film alternates between three different pe... 31 July 2017 ur45202903 False film gave insufficient background on what was ... 2 disappointed e47d2365-78d9-44c1-86e8-9c02d190b0cb both
573904 tt5013056 evacuation of allied soldiers from the british... 1h 46min [Action, Drama, History] 8.1 2017-07-21 the film alternates between three different pe... 28 July 2017 ur1319404 False in screen writing, a shot is an image captured... 7 dunkirk for millennials d985d458-0b1b-4cec-a5d2-6e9431770ce0 both
573905 tt5013056 evacuation of allied soldiers from the british... 1h 46min [Action, Drama, History] 8.1 2017-07-21 the film alternates between three different pe... 25 July 2017 ur4161069 False in a movie that entirely engulfs you it's rath... 10 christopher nolan reaches yet another new high... 88c38da2-d4f7-4ee2-ae1b-76c8d430d495 both

538828 rows × 15 columns

Normalize unicode characters

In [ ]:
# Normalize unicode characters
movie_reviews['review_text'] = movie_reviews['review_text'].map(lambda x: x.encode('ascii', 'ignore').decode('ascii'))
movie_reviews['plot_synopsis'] = movie_reviews['plot_synopsis'].map(lambda x: x.encode('ascii', 'ignore').decode('ascii'))
movie_reviews['plot_summary'] = movie_reviews['plot_summary'].map(lambda x: x.encode('ascii', 'ignore').decode('ascii'))
movie_reviews['review_summary'] = movie_reviews['review_summary'].map(lambda x: x.encode('ascii', 'ignore').decode('ascii'))

import unicodedata

def remove_accents(input_str):
    nfkd_form = unicodedata.normalize('NFKD', input_str)
    return u"".join([c for c in nfkd_form if not unicodedata.combining(c)])
movie_reviews['review_text'] = movie_reviews['review_text'].map(lambda x: remove_accents(x))
movie_reviews['plot_synopsis'] = movie_reviews['plot_synopsis'].map(lambda x: remove_accents(x))
movie_reviews['plot_summary'] = movie_reviews['plot_summary'].map(lambda x: remove_accents(x))
movie_reviews['review_summary'] = movie_reviews['review_summary'].map(lambda x: remove_accents(x))
In [ ]:
movie_reviews
Out[ ]:
movie_id plot_summary duration genre rating_x release_date plot_synopsis review_date user_id is_spoiler review_text rating_y review_summary review_id _merge
0 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 jack ryan (ford) is on a "working vacation" in... 5 March 2008 ur16517420 True the second tom clancy novel made into a film (... 6 decent clancy thriller bbc4b9ff-ea1b-455a-8b7e-32370d35fe17 both
1 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 jack ryan (ford) is on a "working vacation" in... 15 October 2007 ur13887584 True the second in what looks like becoming the 'ja... 7 jack ryan chapter ii e04d4f23-89b5-4715-add8-1ee37d46dd99 both
2 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 jack ryan (ford) is on a "working vacation" in... 16 September 2013 ur2403537 True i was not a fan of the hunt for red october. i... 8 it's filled to the brim with excitement and it... b28cbca4-dc30-421c-8e3a-dc2e3cba98c1 both
3 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 jack ryan (ford) is on a "working vacation" in... 15 May 2007 ur0819382 True jack ryan (harrison ford) is a cia analyst who... 7 patriotic gore. c657363f-224d-48db-a029-8e4ddd2b248d both
4 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 jack ryan (ford) is on a "working vacation" in... 29 February 2004 ur2567338 True this was one of the big summer movies of 1992.... 7 harrison ford in another tom clancy adaptation 44c6a555-674d-4f1c-a9fa-88b35b82c8ba both
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
573901 tt5013056 evacuation of allied soldiers from the british... 1h 46min [Action, Drama, History] 8.1 2017-07-21 the film alternates between three different pe... 2 August 2017 ur78983408 False dunkirk is a beautifully done movie that has h... 10 possibly the greatest war film ever made ef14e72a-77cd-4b9c-9f85-37d01c55fded both
573902 tt5013056 evacuation of allied soldiers from the british... 1h 46min [Action, Drama, History] 8.1 2017-07-21 the film alternates between three different pe... 2 August 2017 ur53645293 False dunkirk is one of the rare cases a film receiv... 10 why this could actually be one of the best war... d4bfc31d-ae8e-4e48-b616-936132153630 both
573903 tt5013056 evacuation of allied soldiers from the british... 1h 46min [Action, Drama, History] 8.1 2017-07-21 the film alternates between three different pe... 31 July 2017 ur45202903 False film gave insufficient background on what was ... 2 disappointed e47d2365-78d9-44c1-86e8-9c02d190b0cb both
573904 tt5013056 evacuation of allied soldiers from the british... 1h 46min [Action, Drama, History] 8.1 2017-07-21 the film alternates between three different pe... 28 July 2017 ur1319404 False in screen writing, a shot is an image captured... 7 dunkirk for millennials d985d458-0b1b-4cec-a5d2-6e9431770ce0 both
573905 tt5013056 evacuation of allied soldiers from the british... 1h 46min [Action, Drama, History] 8.1 2017-07-21 the film alternates between three different pe... 25 July 2017 ur4161069 False in a movie that entirely engulfs you it's rath... 10 christopher nolan reaches yet another new high... 88c38da2-d4f7-4ee2-ae1b-76c8d430d495 both

538828 rows × 15 columns

In [ ]:
movie_reviews.head(1)['plot_synopsis'].item()
Out[ ]:
'jack ryan (ford) is on a "working vacation" in london with his family. he has retired from the cia and is a professor at the us naval academy. he is seen delivering a lecture at the royal naval academy in london.meanwhile, ryan\'s wife cathy and daughter sally are sightseeing near buckingham palace. sally and cathy come upon a british royal guard, and sally tries to get the guard to react by doing an improvised tap dance in front of him. she\'s impressed when the guard, trained to ignore distraction, doesn\'t react at all, and they leave.as sally and cathy walk away from the guard, en route to rendezvous with ryan, they walk by a stolen cab, in which sit three ulster liberation army terrorists: kevin o\'donnell, the driver, as well as sean miller (sean bean) and his younger brother patrick. the three are loading bullets into their guns as they prepare to carry out a scheduled ambush on lord william holmes, british secretary of state for northern ireland and a distant member of the british royal family (the cousin of the queen mother). as they finish loading their weapons, another terrorist, annette, radios to them that lord holmes is leaving the palace with his wife and son.as lord holmes\'s car leaves the palace, the terrorists\' car falls in line behind him. they follow the car, headed for the ambush point. as they drive, o\'donnell and the miller brothers don ski masks.meanwhile, ryan is about to cross the street to meet with his wife and daughter. he arrives at the same time that lord holmes\'s car is passing through the area.suddenly, another cab, parked on the side of the road, driven by annette, pulls out and swerves in front of lord holmes\'s vehicle. simultaneously, o\'donnell\'s vehicle swerves to a stop behind the car. the millers and another accomplice jump out and quickly plant a bomb underneath the chauffeured car\'s engine. ryan sees what\'s going on and hastily gets his family to cover just as the bomb is detonated. moments later, sean and another accomplice begin shooting into the car, riddling lord holmes\'s driver and bodyguard with bullet. the onslaught of bullets does not kill lord holmes, who is currently using his body to shield his family. sean marches over to the rear door and orders lord holmes out of the car at gunpoint. ryan, unable to stand being just a bystander, immediately breaks cover, runs up behind miller, and disarms him. as ryan grabs sean\'s pistol, he gets shot in the left shoulder by another terrorist. he shoots one of the other terrorists in the shoulder. patrick attempts to flee, but ryan fatally shoots him in the chest. seeing two of the royal guards racing towards the scene, the remaining terrorists quickly get back in their cars and drive off, leaving sean to be captured when the authorities arrive. they retreat back to a warehouse, where they change out license plates.while recovering, ryan is called to testify against miller as a witness. subsequently, miller is convicted on all charges and sentenced to life in prison. ryan is awarded the order of knight commander of the victorian order, and eventually returns to the united states.on the possibility that miller\'s ula comrades might try to liberate him, the authorities take no chances. to that extent, they set up decoy convoys on the day that miller is to be transferred to albany prison on the isle of wight. however, someone manages to tip off the ula as to which convoy is carrying miller. miller\'s motorcade is in transit when it comes to a stop at a drawbridge. the raised bridge, however, is a trap meant to block them in. moments after the convoy comes to a stop, miller\'s comrades attack, using rocket launchers to blow up the cars at the front and rear of the convoy. they then march up to the prison van and order the guards out at gunpoint by taking the bridge operator hostage and threatening to shoot him. one of the guards hesitates and is shot fatally. the other officer and the inspector are pulled out, thrown on the ground and forced to lie on their stomachs. o\'donnell hands miller a pistol, with which he coldly executes both officers and the bridge operator by shooting them in the backs of their heads. they then depart the scene.miller and his companions flee on a cargo ship from britain to libya, to prepare for their next kidnapping attempt on lord holmes. miller however, cannot shake his anger towards ryan for killing his younger brother and persuades several members of his entourage to accompany him to the united states on a short mission targeting ryan and his family.miller travels to the united states, accompanied by annette, comrade ned clark, and a fourth henchman. annette and clark travel to the united states naval academy to ambush ryan as he\'s leaving work. ryan notices clark idling nearby. clark casually walks away, but ryan clearly unnerved, then gets even more nervous as he hears a car engine starting. his suspicions are well warranted-as he continues walking down the street, he looks in the reflection of a parked van\'s rearview door and sees that clark, and the stolen car driven by annette, are following him. ryan pretends not to see them until he\'s walking past a parked panel truck, at which point he ducks behind the truck. clark draws a silenced pistol and prepares to corner ryan, but ryan attacks him first. clark quickly gets the upper hand and throws ryan to the curb. he grabs his pistol and is about to shoot ryan when he\'s shot dead himself by a us marine sentry. annette quickly drives away in her jeep. ryan, realizing that miller is going to target cathy and sally, jumps into his car and drives off.simultaneously, miller and the other henchman follow cathy as she picks up sally from school, with miller riding shotgun. they prepare to follow the two, but are temporarily held up by a crossing guard. once the crossing guard clears, they pursue her to us highway 50. ryan tries to frantically call cathy on his car phone, but she is busy answering phone calls from the hospital. as he\'s trying to connect with her, miller and his henchman recklessly weave in and out of traffic to catch up with cathy. ryan eventually gets through to cathy and tells her to get off the highway and find the nearest police station. unfortunately, just as the call ends, miller\'s van comes charging up on her using the left shoulder. miller moves to the backseat of the van, grabs a submachine gun, and slides open the door. another car momentarily delays him, but once he has a clear aim, miller opens fire, spraying cathy\'s car with bullets. a multicar pileup ensues as cathy loses control of her car and crashes head-on into a concrete barrier. ryan, who has found himself stuck in traffic headed in the other direction, sees the smoke from the crash and realizes the worst.cathy and sally are airlifted to a nearby hospital for treatment. enraged over the near-loss of his family, ryan decides to go back to work for the cia, having earlier rejected the appeal of his former superior, vice admiral james greer (james earl jones).ryan\'s tireless work leads him to conclude that miller has taken refuge in a training camp, one of many located in libya. he also determines that a recent new arrival at the camp may be one of o\'donnell\'s english informants, dennis cooley. ryan makes his recommendations to his superiors at the cia. greer invites jack to a live feed of an sas strike team attack on the camp. everyone in the camp is eliminated but unbeknownst to the cia and ryan, miller and his companions have already fled the camp and are on their way to the us to stage their next attack on lord holmes.lord holmes decides to visit ryan at his home to formally present his kcvo. with the aid of lord holmes\' traitorous assistant, miller\'s group tracks holmes to this location, kills the diplomatic security service agents and maryland state troopers guarding the house, and attempts once more to kidnap lord holmes. ryan leads holmes and his family to safety while he attempts to lure miller and his companions away from his home.the fbi hostage rescue teams are scrambled to pick up holmes. upon realising that ryan is leading them away from holmes, miller\'s companions try to persuade miller to turn around, but an enraged and deranged miller kills his terrorist companions and continues his pursuit of ryan. ryan and miller fight hand to hand; miller is killed when ryan impales him backward on a boat anchor, and his body is obliterated in the subsequent explosion of the craft.credits roll just after caroline ryan learns the gender of the child she is going to have, and before she tells jack and sally.'

Remove URLs, HTML tags and special characters

In [ ]:
# Remove URLs. HTML tags and special characters

import re

def remove_urls(text):
    # Remove URLs
    text = re.sub(r'http\S+', '', text)  # Remove http URLs
    text = re.sub(r'www\S+', '', text)   # Remove www URLs

    # Remove HTML tags
    text = re.sub(r'<.*?>', '', text)

    # Remove special characters
    text = re.sub(r'[^\w\s.,?!\']', '', text)  # Remove special characters
    return text

movie_reviews['review_text'] = movie_reviews['review_text'].map(lambda x: remove_urls(x))
movie_reviews['plot_synopsis'] = movie_reviews['plot_synopsis'].map(lambda x: remove_urls(x))
movie_reviews['plot_summary'] = movie_reviews['plot_summary'].map(lambda x: remove_urls(x))
movie_reviews['review_summary'] = movie_reviews['review_summary'].map(lambda x: remove_urls(x))
In [ ]:
movie_reviews.head(1)['plot_synopsis'].item()
Out[ ]:
"jack ryan ford is on a working vacation in london with his family. he has retired from the cia and is a professor at the us naval academy. he is seen delivering a lecture at the royal naval academy in london.meanwhile, ryan's wife cathy and daughter sally are sightseeing near buckingham palace. sally and cathy come upon a british royal guard, and sally tries to get the guard to react by doing an improvised tap dance in front of him. she's impressed when the guard, trained to ignore distraction, doesn't react at all, and they leave.as sally and cathy walk away from the guard, en route to rendezvous with ryan, they walk by a stolen cab, in which sit three ulster liberation army terrorists kevin o'donnell, the driver, as well as sean miller sean bean and his younger brother patrick. the three are loading bullets into their guns as they prepare to carry out a scheduled ambush on lord william holmes, british secretary of state for northern ireland and a distant member of the british royal family the cousin of the queen mother. as they finish loading their weapons, another terrorist, annette, radios to them that lord holmes is leaving the palace with his wife and son.as lord holmes's car leaves the palace, the terrorists' car falls in line behind him. they follow the car, headed for the ambush point. as they drive, o'donnell and the miller brothers don ski masks.meanwhile, ryan is about to cross the street to meet with his wife and daughter. he arrives at the same time that lord holmes's car is passing through the area.suddenly, another cab, parked on the side of the road, driven by annette, pulls out and swerves in front of lord holmes's vehicle. simultaneously, o'donnell's vehicle swerves to a stop behind the car. the millers and another accomplice jump out and quickly plant a bomb underneath the chauffeured car's engine. ryan sees what's going on and hastily gets his family to cover just as the bomb is detonated. moments later, sean and another accomplice begin shooting into the car, riddling lord holmes's driver and bodyguard with bullet. the onslaught of bullets does not kill lord holmes, who is currently using his body to shield his family. sean marches over to the rear door and orders lord holmes out of the car at gunpoint. ryan, unable to stand being just a bystander, immediately breaks cover, runs up behind miller, and disarms him. as ryan grabs sean's pistol, he gets shot in the left shoulder by another terrorist. he shoots one of the other terrorists in the shoulder. patrick attempts to flee, but ryan fatally shoots him in the chest. seeing two of the royal guards racing towards the scene, the remaining terrorists quickly get back in their cars and drive off, leaving sean to be captured when the authorities arrive. they retreat back to a warehouse, where they change out license plates.while recovering, ryan is called to testify against miller as a witness. subsequently, miller is convicted on all charges and sentenced to life in prison. ryan is awarded the order of knight commander of the victorian order, and eventually returns to the united states.on the possibility that miller's ula comrades might try to liberate him, the authorities take no chances. to that extent, they set up decoy convoys on the day that miller is to be transferred to albany prison on the isle of wight. however, someone manages to tip off the ula as to which convoy is carrying miller. miller's motorcade is in transit when it comes to a stop at a drawbridge. the raised bridge, however, is a trap meant to block them in. moments after the convoy comes to a stop, miller's comrades attack, using rocket launchers to blow up the cars at the front and rear of the convoy. they then march up to the prison van and order the guards out at gunpoint by taking the bridge operator hostage and threatening to shoot him. one of the guards hesitates and is shot fatally. the other officer and the inspector are pulled out, thrown on the ground and forced to lie on their stomachs. o'donnell hands miller a pistol, with which he coldly executes both officers and the bridge operator by shooting them in the backs of their heads. they then depart the scene.miller and his companions flee on a cargo ship from britain to libya, to prepare for their next kidnapping attempt on lord holmes. miller however, cannot shake his anger towards ryan for killing his younger brother and persuades several members of his entourage to accompany him to the united states on a short mission targeting ryan and his family.miller travels to the united states, accompanied by annette, comrade ned clark, and a fourth henchman. annette and clark travel to the united states naval academy to ambush ryan as he's leaving work. ryan notices clark idling nearby. clark casually walks away, but ryan clearly unnerved, then gets even more nervous as he hears a car engine starting. his suspicions are well warrantedas he continues walking down the street, he looks in the reflection of a parked van's rearview door and sees that clark, and the stolen car driven by annette, are following him. ryan pretends not to see them until he's walking past a parked panel truck, at which point he ducks behind the truck. clark draws a silenced pistol and prepares to corner ryan, but ryan attacks him first. clark quickly gets the upper hand and throws ryan to the curb. he grabs his pistol and is about to shoot ryan when he's shot dead himself by a us marine sentry. annette quickly drives away in her jeep. ryan, realizing that miller is going to target cathy and sally, jumps into his car and drives off.simultaneously, miller and the other henchman follow cathy as she picks up sally from school, with miller riding shotgun. they prepare to follow the two, but are temporarily held up by a crossing guard. once the crossing guard clears, they pursue her to us highway 50. ryan tries to frantically call cathy on his car phone, but she is busy answering phone calls from the hospital. as he's trying to connect with her, miller and his henchman recklessly weave in and out of traffic to catch up with cathy. ryan eventually gets through to cathy and tells her to get off the highway and find the nearest police station. unfortunately, just as the call ends, miller's van comes charging up on her using the left shoulder. miller moves to the backseat of the van, grabs a submachine gun, and slides open the door. another car momentarily delays him, but once he has a clear aim, miller opens fire, spraying cathy's car with bullets. a multicar pileup ensues as cathy loses control of her car and crashes headon into a concrete barrier. ryan, who has found himself stuck in traffic headed in the other direction, sees the smoke from the crash and realizes the worst.cathy and sally are airlifted to a nearby hospital for treatment. enraged over the nearloss of his family, ryan decides to go back to work for the cia, having earlier rejected the appeal of his former superior, vice admiral james greer james earl jones.ryan's tireless work leads him to conclude that miller has taken refuge in a training camp, one of many located in libya. he also determines that a recent new arrival at the camp may be one of o'donnell's english informants, dennis cooley. ryan makes his recommendations to his superiors at the cia. greer invites jack to a live feed of an sas strike team attack on the camp. everyone in the camp is eliminated but unbeknownst to the cia and ryan, miller and his companions have already fled the camp and are on their way to the us to stage their next attack on lord holmes.lord holmes decides to visit ryan at his home to formally present his kcvo. with the aid of lord holmes' traitorous assistant, miller's group tracks holmes to this location, kills the diplomatic security service agents and maryland state troopers guarding the house, and attempts once more to kidnap lord holmes. ryan leads holmes and his family to safety while he attempts to lure miller and his companions away from his home.the fbi hostage rescue teams are scrambled to pick up holmes. upon realising that ryan is leading them away from holmes, miller's companions try to persuade miller to turn around, but an enraged and deranged miller kills his terrorist companions and continues his pursuit of ryan. ryan and miller fight hand to hand miller is killed when ryan impales him backward on a boat anchor, and his body is obliterated in the subsequent explosion of the craft.credits roll just after caroline ryan learns the gender of the child she is going to have, and before she tells jack and sally."
In [ ]:
movie_reviews
Out[ ]:
movie_id plot_summary duration genre rating_x release_date plot_synopsis review_date user_id is_spoiler review_text rating_y review_summary review_id _merge
0 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 jack ryan ford is on a working vacation in lon... 5 March 2008 ur16517420 True the second tom clancy novel made into a film '... 6 decent clancy thriller bbc4b9ff-ea1b-455a-8b7e-32370d35fe17 both
1 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 jack ryan ford is on a working vacation in lon... 15 October 2007 ur13887584 True the second in what looks like becoming the 'ja... 7 jack ryan chapter ii e04d4f23-89b5-4715-add8-1ee37d46dd99 both
2 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 jack ryan ford is on a working vacation in lon... 16 September 2013 ur2403537 True i was not a fan of the hunt for red october. i... 8 it's filled to the brim with excitement and it... b28cbca4-dc30-421c-8e3a-dc2e3cba98c1 both
3 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 jack ryan ford is on a working vacation in lon... 15 May 2007 ur0819382 True jack ryan harrison ford is a cia analyst who s... 7 patriotic gore. c657363f-224d-48db-a029-8e4ddd2b248d both
4 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min [Action, Thriller] 6.9 1992-06-05 jack ryan ford is on a working vacation in lon... 29 February 2004 ur2567338 True this was one of the big summer movies of 1992.... 7 harrison ford in another tom clancy adaptation 44c6a555-674d-4f1c-a9fa-88b35b82c8ba both
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
573901 tt5013056 evacuation of allied soldiers from the british... 1h 46min [Action, Drama, History] 8.1 2017-07-21 the film alternates between three different pe... 2 August 2017 ur78983408 False dunkirk is a beautifully done movie that has h... 10 possibly the greatest war film ever made ef14e72a-77cd-4b9c-9f85-37d01c55fded both
573902 tt5013056 evacuation of allied soldiers from the british... 1h 46min [Action, Drama, History] 8.1 2017-07-21 the film alternates between three different pe... 2 August 2017 ur53645293 False dunkirk is one of the rare cases a film receiv... 10 why this could actually be one of the best war... d4bfc31d-ae8e-4e48-b616-936132153630 both
573903 tt5013056 evacuation of allied soldiers from the british... 1h 46min [Action, Drama, History] 8.1 2017-07-21 the film alternates between three different pe... 31 July 2017 ur45202903 False film gave insufficient background on what was ... 2 disappointed e47d2365-78d9-44c1-86e8-9c02d190b0cb both
573904 tt5013056 evacuation of allied soldiers from the british... 1h 46min [Action, Drama, History] 8.1 2017-07-21 the film alternates between three different pe... 28 July 2017 ur1319404 False in screen writing, a shot is an image captured... 7 dunkirk for millennials d985d458-0b1b-4cec-a5d2-6e9431770ce0 both
573905 tt5013056 evacuation of allied soldiers from the british... 1h 46min [Action, Drama, History] 8.1 2017-07-21 the film alternates between three different pe... 25 July 2017 ur4161069 False in a movie that entirely engulfs you it's rath... 10 christopher nolan reaches yet another new high... 88c38da2-d4f7-4ee2-ae1b-76c8d430d495 both

538828 rows × 15 columns

Expand contractions

In [ ]:
# Expand contractions

import re

# Contractions dictionary
contractions = {
    "aren't": "are not", "can't": "cannot", "couldn't": "could not", "didn't": "did not",
    "doesn't": "does not", "don't": "do not", "hadn't": "had not", "hasn't": "has not",
    "haven't": "have not", "he'd": "he had", "he'll": "he will", "he's": "he is", "i'd": "i had",
    "i'll": "i will", "i'm": "i am", "i've": "i have",
    "I'd": "i had", "I'll": "i will", "I'm": "i am", "I've": "i have", "isn't": "is not",
    "let's": "let us", "mightn't": "might not", "mustn't": "must not", "shan't": "shall not",
    "she'd": "she had", "she'll": "she will", "she's": "she is", "shouldn't": "should not",
    "that's": "that is", "there's": "there is", "they'd": "they had", "they'll": "they will",
    "they're": "they are", "they've": "they have", "we'd": "we had", "we're": "we are",
    "we've": "we have", "weren't": "were not", "what'll": "what will", "what're": "what are",
    "what's": "what is", "what've": "what have", "where's": "where is", "who'd": "who had",
    "who'll": "who will", "who're": "who are", "who's": "who is", "who've": "who have",
    "won't": "will not", "wouldn't": "would not", "you'd": "you had", "you'll": "you will",
    "you're": "you are", "you've": "you have"
}

# Function to expand contractions
def expand_contractions(text, contractions_dict):
    #text = text.replace("\\'", "'")

    contractions_pattern = re.compile('({})'.format('|'.join(contractions_dict.keys())), flags=re.IGNORECASE | re.DOTALL)
    def replace(match):
        print(match)#
        match = match.group(0)
        first_char = match[0]
        expanded_contraction = contractions_dict.get(match) if contractions_dict.get(match) else contractions_dict.get(match.lower())
        expanded_contraction = first_char + expanded_contraction[1:]
        return expanded_contraction

    expanded_text = contractions_pattern.sub(replace, text)
    return expanded_text

# Example usage
movie_reviews['review_text'] = movie_reviews['review_text'].map(lambda x: expand_contractions(x, contractions))
movie_reviews['plot_synopsis'] = movie_reviews['plot_synopsis'].map(lambda x: expand_contractions(x, contractions))
movie_reviews['plot_summary'] = movie_reviews['plot_summary'].map(lambda x: expand_contractions(x, contractions))
movie_reviews['review_summary'] = movie_reviews['review_summary'].map(lambda x: expand_contractions(x, contractions))


#if expanded_contraction is None:
#            return match
Streaming output truncated to the last 5000 lines.
<re.Match object; span=(34, 39), match="don't">
<re.Match object; span=(4, 11), match="where's">
<re.Match object; span=(23, 30), match="doesn't">
<re.Match object; span=(20, 24), match="i've">
<re.Match object; span=(28, 34), match="you're">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(7, 14), match="where's">
<re.Match object; span=(8, 13), match="don't">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(26, 31), match="don't">
<re.Match object; span=(10, 16), match="you're">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(12, 18), match="that's">
<re.Match object; span=(40, 46), match="aren't">
<re.Match object; span=(29, 34), match="don't">
<re.Match object; span=(46, 50), match="i've">
<re.Match object; span=(31, 34), match="i'm">
<re.Match object; span=(26, 31), match="can't">
<re.Match object; span=(37, 41), match="he's">
<re.Match object; span=(9, 15), match="hadn't">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(15, 20), match="isn't">
<re.Match object; span=(21, 25), match="i've">
<re.Match object; span=(3, 12), match="shouldn't">
<re.Match object; span=(11, 18), match="doesn't">
<re.Match object; span=(13, 18), match="can't">
<re.Match object; span=(20, 27), match="there's">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(30, 34), match="i've">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(8, 14), match="you're">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(17, 23), match="didn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(43, 48), match="isn't">
<re.Match object; span=(16, 21), match="don't">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(30, 34), match="i've">
<re.Match object; span=(2, 7), match="won't">
<re.Match object; span=(7, 11), match="i'll">
<re.Match object; span=(28, 35), match="doesn't">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(41, 47), match="that's">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(17, 22), match="won't">
<re.Match object; span=(34, 40), match="they'd">
<re.Match object; span=(41, 44), match="i'd">
<re.Match object; span=(46, 51), match="don't">
<re.Match object; span=(48, 54), match="didn't">
<re.Match object; span=(24, 29), match="don't">
<re.Match object; span=(25, 31), match="that's">
<re.Match object; span=(3, 11), match="wouldn't">
<re.Match object; span=(33, 39), match="hadn't">
<re.Match object; span=(36, 41), match="don't">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(35, 39), match="i've">
<re.Match object; span=(18, 24), match="you've">
<re.Match object; span=(10, 17), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(53, 61), match="wouldn't">
<re.Match object; span=(13, 18), match="don't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(51, 55), match="i've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 5), match="who'd">
<re.Match object; span=(8, 13), match="you'd">
<re.Match object; span=(6, 11), match="don't">
<re.Match object; span=(27, 33), match="that's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(43, 46), match="i'm">
<re.Match object; span=(34, 38), match="i've">
<re.Match object; span=(1, 8), match="there's">
<re.Match object; span=(45, 52), match="there's">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(12, 18), match="didn't">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(54, 58), match="i've">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(7, 12), match="can't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(29, 33), match="i've">
<re.Match object; span=(32, 36), match="i've">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(8, 11), match="i'm">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(34, 38), match="i've">
<re.Match object; span=(18, 23), match="don't">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(36, 40), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(35, 40), match="can't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(23, 29), match="you'll">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(37, 41), match="i'll">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(31, 37), match="you'll">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(12, 16), match="i've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(28, 37), match="shouldn't">
<re.Match object; span=(46, 53), match="they're">
<re.Match object; span=(0, 5), match="we're">
<re.Match object; span=(18, 24), match="that's">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(5, 10), match="isn't">
<re.Match object; span=(27, 33), match="you're">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(32, 37), match="don't">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(35, 41), match="what's">
<re.Match object; span=(11, 16), match="isn't">
<re.Match object; span=(95, 100), match="who's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(45, 49), match="i'll">
<re.Match object; span=(36, 40), match="i've">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(21, 26), match="don't">
<re.Match object; span=(14, 19), match="can't">
<re.Match object; span=(4, 9), match="won't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(7, 13), match="didn't">
<re.Match object; span=(30, 35), match="don't">
<re.Match object; span=(33, 37), match="i've">
<re.Match object; span=(31, 37), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(25, 29), match="i've">
<re.Match object; span=(19, 23), match="i'll">
<re.Match object; span=(0, 6), match="you've">
<re.Match object; span=(15, 20), match="don't">
<re.Match object; span=(27, 31), match="i've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(28, 36), match="couldn't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(8, 14), match="you've">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(47, 54), match="doesn't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(20, 26), match="that's">
<re.Match object; span=(44, 50), match="you've">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(30, 35), match="don't">
<re.Match object; span=(10, 15), match="isn't">
<re.Match object; span=(10, 17), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(5, 12), match="doesn't">
<re.Match object; span=(45, 51), match="you're">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(23, 29), match="you'll">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(27, 31), match="i've">
<re.Match object; span=(25, 29), match="i've">
<re.Match object; span=(58, 64), match="you'll">
<re.Match object; span=(315, 320), match="who's">
<re.Match object; span=(0, 6), match="hadn't">
<re.Match object; span=(26, 31), match="don't">
<re.Match object; span=(27, 31), match="i'll">
<re.Match object; span=(28, 35), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(31, 34), match="i'm">
<re.Match object; span=(23, 28), match="let's">
<re.Match object; span=(51, 57), match="that's">
<re.Match object; span=(44, 48), match="i've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(36, 43), match="doesn't">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(7, 12), match="can't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(5, 13), match="couldn't">
<re.Match object; span=(0, 5), match="who's">
<re.Match object; span=(19, 24), match="we've">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(19, 26), match="haven't">
<re.Match object; span=(63, 69), match="didn't">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(42, 46), match="i've">
<re.Match object; span=(11, 16), match="isn't">
<re.Match object; span=(12, 18), match="didn't">
<re.Match object; span=(18, 25), match="doesn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(34, 39), match="we've">
<re.Match object; span=(45, 51), match="you're">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(0, 6), match="that's">
<re.Match object; span=(23, 29), match="didn't">
<re.Match object; span=(60, 67), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(19, 26), match="doesn't">
<re.Match object; span=(9, 14), match="isn't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(30, 36), match="you'll">
<re.Match object; span=(71, 77), match="you'll">
<re.Match object; span=(44, 49), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(73, 80), match="doesn't">
<re.Match object; span=(0, 5), match="we've">
<re.Match object; span=(5, 11), match="what's">
<re.Match object; span=(8, 11), match="i'm">
<re.Match object; span=(65, 70), match="can't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 7), match="haven't">
<re.Match object; span=(5, 14), match="shouldn't">
<re.Match object; span=(54, 60), match="that's">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(35, 40), match="can't">
<re.Match object; span=(3, 9), match="you've">
<re.Match object; span=(42, 47), match="don't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(7, 14), match="doesn't">
<re.Match object; span=(29, 33), match="i've">
<re.Match object; span=(33, 38), match="isn't">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(0, 5), match="who's">
<re.Match object; span=(15, 20), match="can't">
<re.Match object; span=(12, 15), match="i'm">
<re.Match object; span=(34, 40), match="you'll">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(26, 30), match="i've">
<re.Match object; span=(13, 19), match="didn't">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(30, 37), match="doesn't">
<re.Match object; span=(6, 12), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(18, 23), match="don't">
<re.Match object; span=(59, 63), match="i've">
<re.Match object; span=(29, 33), match="i've">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 5), match="won't">
<re.Match object; span=(0, 7), match="they're">
<re.Match object; span=(67, 73), match="you'll">
<re.Match object; span=(8, 14), match="you're">
<re.Match object; span=(18, 22), match="i've">
<re.Match object; span=(0, 6), match="hasn't">
<re.Match object; span=(0, 7), match="they're">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(39, 46), match="doesn't">
<re.Match object; span=(13, 18), match="don't">
<re.Match object; span=(9, 14), match="don't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(16, 23), match="doesn't">
<re.Match object; span=(0, 5), match="isn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(53, 59), match="that's">
<re.Match object; span=(36, 41), match="won't">
<re.Match object; span=(11, 17), match="you'll">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(42, 47), match="can't">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(14, 17), match="i'm">
<re.Match object; span=(14, 20), match="that's">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(5, 8), match="i'm">
<re.Match object; span=(37, 40), match="i'd">
<re.Match object; span=(19, 24), match="don't">
<re.Match object; span=(17, 23), match="that's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(21, 26), match="don't">
<re.Match object; span=(3, 8), match="isn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(24, 30), match="you're">
<re.Match object; span=(52, 59), match="doesn't">
<re.Match object; span=(3, 11), match="couldn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 11), match="shouldn't">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(17, 23), match="you've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(41, 46), match="you'd">
<re.Match object; span=(34, 38), match="i've">
<re.Match object; span=(21, 25), match="i've">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(52, 57), match="can't">
<re.Match object; span=(36, 41), match="don't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(69, 74), match="can't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(10, 14), match="he's">
<re.Match object; span=(22, 28), match="that's">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(10, 16), match="what's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(41, 47), match="didn't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(5, 8), match="i'd">
<re.Match object; span=(36, 41), match="don't">
<re.Match object; span=(15, 20), match="don't">
<re.Match object; span=(15, 21), match="didn't">
<re.Match object; span=(36, 41), match="don't">
<re.Match object; span=(12, 19), match="doesn't">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(36, 42), match="you'll">
<re.Match object; span=(33, 37), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(46, 53), match="there's">
<re.Match object; span=(14, 21), match="doesn't">
<re.Match object; span=(7, 13), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(19, 24), match="don't">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(32, 37), match="don't">
<re.Match object; span=(47, 53), match="you'll">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(30, 36), match="aren't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(12, 16), match="i've">
<re.Match object; span=(23, 28), match="don't">
<re.Match object; span=(50, 55), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(22, 28), match="didn't">
<re.Match object; span=(18, 23), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(24, 29), match="don't">
<re.Match object; span=(22, 28), match="didn't">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(66, 71), match="don't">
<re.Match object; span=(42, 48), match="didn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(31, 37), match="that's">
<re.Match object; span=(18, 24), match="you're">
<re.Match object; span=(4, 9), match="won't">
<re.Match object; span=(25, 30), match="don't">
<re.Match object; span=(55, 59), match="i've">
<re.Match object; span=(53, 61), match="couldn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(12, 16), match="i've">
<re.Match object; span=(43, 48), match="don't">
<re.Match object; span=(65, 70), match="don't">
<re.Match object; span=(16, 19), match="i'm">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(26, 32), match="you're">
<re.Match object; span=(15, 21), match="aren't">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(7, 12), match="can't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(31, 36), match="don't">
<re.Match object; span=(63, 70), match="they'll">
<re.Match object; span=(20, 25), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(30, 35), match="isn't">
<re.Match object; span=(19, 25), match="you'll">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 5), match="we've">
<re.Match object; span=(5, 10), match="can't">
<re.Match object; span=(9, 15), match="you've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(5, 10), match="isn't">
<re.Match object; span=(32, 39), match="doesn't">
<re.Match object; span=(0, 9), match="shouldn't">
<re.Match object; span=(26, 31), match="don't">
<re.Match object; span=(13, 18), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(20, 26), match="you'll">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(23, 29), match="didn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(55, 61), match="you'll">
<re.Match object; span=(11, 17), match="you're">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(9, 15), match="you're">
<re.Match object; span=(33, 38), match="don't">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(34, 40), match="you'll">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(13, 18), match="won't">
<re.Match object; span=(39, 45), match="that's">
<re.Match object; span=(15, 21), match="you'll">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(34, 40), match="you'll">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(34, 40), match="you'll">
<re.Match object; span=(35, 41), match="you'll">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(26, 31), match="don't">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(16, 21), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(36, 42), match="didn't">
<re.Match object; span=(10, 17), match="there's">
<re.Match object; span=(11, 18), match="doesn't">
<re.Match object; span=(39, 46), match="doesn't">
<re.Match object; span=(18, 23), match="don't">
<re.Match object; span=(24, 30), match="that's">
<re.Match object; span=(39, 45), match="that's">
<re.Match object; span=(41, 47), match="didn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(26, 29), match="i'd">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(15, 21), match="that's">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(23, 29), match="you'll">
<re.Match object; span=(28, 33), match="don't">
<re.Match object; span=(9, 16), match="doesn't">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(16, 21), match="can't">
<re.Match object; span=(31, 40), match="shouldn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(43, 49), match="you'll">
<re.Match object; span=(65, 72), match="doesn't">
<re.Match object; span=(78, 83), match="won't">
<re.Match object; span=(21, 26), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(12, 17), match="can't">
<re.Match object; span=(30, 37), match="doesn't">
<re.Match object; span=(60, 67), match="doesn't">
<re.Match object; span=(52, 57), match="don't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(44, 52), match="wouldn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(21, 25), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 6), match="that's">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(18, 23), match="don't">
<re.Match object; span=(49, 53), match="i've">
<re.Match object; span=(28, 31), match="i'd">
<re.Match object; span=(10, 16), match="you're">
<re.Match object; span=(32, 37), match="don't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(7, 13), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(46, 52), match="you're">
<re.Match object; span=(15, 20), match="don't">
<re.Match object; span=(15, 21), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(37, 44), match="there's">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(25, 32), match="there's">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(16, 23), match="doesn't">
<re.Match object; span=(33, 38), match="don't">
<re.Match object; span=(45, 48), match="i'm">
<re.Match object; span=(32, 37), match="don't">
<re.Match object; span=(12, 19), match="doesn't">
<re.Match object; span=(0, 7), match="haven't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(18, 23), match="don't">
<re.Match object; span=(56, 61), match="don't">
<re.Match object; span=(42, 48), match="you'll">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(49, 52), match="i'm">
<re.Match object; span=(44, 50), match="you're">
<re.Match object; span=(8, 14), match="aren't">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(10, 18), match="couldn't">
<re.Match object; span=(42, 48), match="you're">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(31, 37), match="you've">
<re.Match object; span=(5, 11), match="didn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(31, 35), match="i've">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(7, 10), match="i'm">
<re.Match object; span=(29, 36), match="doesn't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(36, 41), match="can't">
<re.Match object; span=(58, 64), match="that's">
<re.Match object; span=(14, 21), match="doesn't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(4, 7), match="i'm">
<re.Match object; span=(30, 34), match="i've">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(2, 9), match="haven't">
<re.Match object; span=(39, 44), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(22, 29), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(20, 26), match="you've">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(23, 28), match="don't">
<re.Match object; span=(18, 22), match="i've">
<re.Match object; span=(4, 7), match="i'm">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(13, 20), match="doesn't">
<re.Match object; span=(31, 37), match="didn't">
<re.Match object; span=(25, 31), match="you're">
<re.Match object; span=(75, 80), match="we're">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(24, 30), match="you've">
<re.Match object; span=(18, 22), match="i've">
<re.Match object; span=(4, 9), match="won't">
<re.Match object; span=(13, 18), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(33, 39), match="you've">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(8, 15), match="doesn't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(68, 74), match="that's">
<re.Match object; span=(30, 34), match="i've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(11, 17), match="didn't">
<re.Match object; span=(0, 6), match="you've">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(37, 42), match="isn't">
<re.Match object; span=(22, 30), match="couldn't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(26, 32), match="you're">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(3, 9), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(13, 19), match="that's">
<re.Match object; span=(10, 14), match="i've">
<re.Match object; span=(26, 32), match="that's">
<re.Match object; span=(21, 25), match="he's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(28, 32), match="i've">
<re.Match object; span=(20, 25), match="don't">
<re.Match object; span=(16, 22), match="aren't">
<re.Match object; span=(0, 6), match="that's">
<re.Match object; span=(16, 23), match="doesn't">
<re.Match object; span=(12, 16), match="i've">
<re.Match object; span=(75, 79), match="i've">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(10, 15), match="isn't">
<re.Match object; span=(13, 20), match="doesn't">
<re.Match object; span=(37, 42), match="don't">
<re.Match object; span=(3, 6), match="i'd">
<re.Match object; span=(4, 7), match="i'm">
<re.Match object; span=(28, 35), match="doesn't">
<re.Match object; span=(32, 39), match="doesn't">
<re.Match object; span=(24, 29), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(5, 14), match="shouldn't">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(24, 29), match="don't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(19, 24), match="don't">
<re.Match object; span=(42, 48), match="didn't">
<re.Match object; span=(30, 35), match="don't">
<re.Match object; span=(51, 56), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(48, 53), match="don't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(5, 12), match="doesn't">
<re.Match object; span=(12, 17), match="don't">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(85, 90), match="can't">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(53, 58), match="isn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(28, 35), match="doesn't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(12, 20), match="couldn't">
<re.Match object; span=(36, 42), match="that's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(14, 17), match="i'm">
<re.Match object; span=(0, 6), match="you've">
<re.Match object; span=(51, 59), match="wouldn't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(5, 10), match="isn't">
<re.Match object; span=(26, 29), match="i'm">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(20, 26), match="what's">
<re.Match object; span=(9, 12), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(38, 43), match="don't">
<re.Match object; span=(4, 11), match="doesn't">
<re.Match object; span=(25, 29), match="i've">
<re.Match object; span=(29, 34), match="don't">
<re.Match object; span=(12, 18), match="you're">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(14, 20), match="you'll">
<re.Match object; span=(26, 32), match="you'll">
<re.Match object; span=(39, 43), match="i've">
<re.Match object; span=(21, 25), match="i've">
<re.Match object; span=(13, 20), match="there's">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(39, 47), match="wouldn't">
<re.Match object; span=(27, 31), match="i've">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(37, 43), match="you'll">
<re.Match object; span=(20, 27), match="doesn't">
<re.Match object; span=(10, 13), match="i'm">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(14, 19), match="don't">
<re.Match object; span=(14, 17), match="i'm">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(4, 10), match="didn't">
<re.Match object; span=(7, 12), match="let's">
<re.Match object; span=(21, 26), match="don't">
<re.Match object; span=(18, 24), match="that's">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(57, 62), match="don't">
<re.Match object; span=(10, 15), match="we're">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(75, 82), match="doesn't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(37, 41), match="i've">
<re.Match object; span=(15, 20), match="can't">
<re.Match object; span=(19, 25), match="you'll">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(9, 14), match="can't">
<re.Match object; span=(3, 9), match="what's">
<re.Match object; span=(10, 16), match="you're">
<re.Match object; span=(15, 21), match="what's">
<re.Match object; span=(23, 30), match="doesn't">
<re.Match object; span=(8, 11), match="i'd">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(11, 14), match="i'm">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(13, 16), match="i'd">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(50, 56), match="that's">
<re.Match object; span=(11, 16), match="don't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(14, 18), match="i've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(22, 27), match="don't">
<re.Match object; span=(6, 13), match="doesn't">
<re.Match object; span=(0, 7), match="they'll">
<re.Match object; span=(15, 20), match="isn't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(35, 39), match="i've">
<re.Match object; span=(0, 5), match="who's">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(29, 33), match="i've">
<re.Match object; span=(37, 43), match="what's">
<re.Match object; span=(45, 50), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(26, 30), match="i've">
<re.Match object; span=(29, 33), match="i've">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(32, 36), match="i've">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(11, 16), match="don't">
<re.Match object; span=(25, 30), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(22, 28), match="that's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(24, 29), match="can't">
<re.Match object; span=(51, 54), match="i'm">
<re.Match object; span=(14, 21), match="doesn't">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(36, 43), match="there's">
<re.Match object; span=(24, 30), match="hasn't">
<re.Match object; span=(23, 28), match="can't">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(16, 23), match="doesn't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(51, 56), match="don't">
<re.Match object; span=(86, 93), match="there's">
<re.Match object; span=(122, 129), match="there's">
<re.Match object; span=(3, 9), match="didn't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(27, 32), match="isn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(4, 10), match="you're">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(3, 8), match="can't">
<re.Match object; span=(14, 20), match="you're">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(59, 66), match="doesn't">
<re.Match object; span=(49, 55), match="didn't">
<re.Match object; span=(10, 15), match="can't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(27, 32), match="don't">
<re.Match object; span=(10, 15), match="isn't">
<re.Match object; span=(10, 15), match="isn't">
<re.Match object; span=(77, 82), match="she's">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(50, 53), match="i'm">
<re.Match object; span=(61, 64), match="i'm">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(45, 50), match="can't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(24, 29), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(11, 16), match="don't">
<re.Match object; span=(25, 32), match="weren't">
<re.Match object; span=(5, 11), match="you're">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(11, 16), match="can't">
<re.Match object; span=(0, 5), match="we've">
<re.Match object; span=(24, 30), match="you'll">
<re.Match object; span=(38, 44), match="you'll">
<re.Match object; span=(50, 56), match="you'll">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(16, 22), match="you're">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(17, 22), match="who's">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(20, 25), match="can't">
<re.Match object; span=(55, 60), match="don't">
<re.Match object; span=(65, 71), match="what's">
<re.Match object; span=(16, 21), match="isn't">
<re.Match object; span=(14, 20), match="you're">
<re.Match object; span=(19, 25), match="that's">
<re.Match object; span=(11, 17), match="you'll">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(24, 31), match="doesn't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(86, 91), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(15, 21), match="you'll">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(7, 13), match="didn't">
<re.Match object; span=(19, 25), match="you're">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(63, 67), match="i've">
<re.Match object; span=(7, 13), match="didn't">
<re.Match object; span=(27, 32), match="can't">
<re.Match object; span=(10, 14), match="i've">
<re.Match object; span=(34, 39), match="we've">
<re.Match object; span=(27, 31), match="i've">
<re.Match object; span=(28, 35), match="haven't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(21, 25), match="i've">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(22, 27), match="don't">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(50, 55), match="don't">
<re.Match object; span=(153, 160), match="doesn't">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(12, 17), match="won't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(28, 34), match="you'll">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(13, 18), match="don't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(8, 14), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(37, 44), match="doesn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(30, 37), match="doesn't">
<re.Match object; span=(36, 43), match="they're">
<re.Match object; span=(12, 19), match="doesn't">
<re.Match object; span=(9, 14), match="don't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(12, 19), match="doesn't">
<re.Match object; span=(24, 30), match="what's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(32, 38), match="that's">
<re.Match object; span=(41, 46), match="you'd">
<re.Match object; span=(31, 35), match="i've">
<re.Match object; span=(26, 29), match="i'm">
<re.Match object; span=(22, 30), match="couldn't">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(27, 32), match="won't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(10, 17), match="doesn't">
<re.Match object; span=(15, 20), match="don't">
<re.Match object; span=(7, 12), match="can't">
<re.Match object; span=(64, 68), match="he's">
<re.Match object; span=(113, 119), match="didn't">
<re.Match object; span=(13, 18), match="isn't">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(18, 23), match="let's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(21, 25), match="i've">
<re.Match object; span=(36, 40), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(11, 17), match="you're">
<re.Match object; span=(16, 22), match="that's">
<re.Match object; span=(24, 31), match="haven't">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(5, 11), match="you're">
<re.Match object; span=(18, 24), match="you're">
<re.Match object; span=(13, 16), match="i'm">
<re.Match object; span=(10, 15), match="isn't">
<re.Match object; span=(13, 19), match="you've">
<re.Match object; span=(23, 28), match="you'd">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(18, 26), match="wouldn't">
<re.Match object; span=(47, 51), match="i've">
<re.Match object; span=(10, 16), match="hasn't">
<re.Match object; span=(11, 16), match="isn't">
<re.Match object; span=(2, 9), match="haven't">
<re.Match object; span=(23, 29), match="didn't">
<re.Match object; span=(25, 29), match="i've">
<re.Match object; span=(41, 45), match="i've">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(32, 37), match="can't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(18, 22), match="i'll">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(29, 36), match="doesn't">
<re.Match object; span=(22, 26), match="he's">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(4, 9), match="won't">
<re.Match object; span=(48, 54), match="you're">
<re.Match object; span=(16, 19), match="i'd">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(0, 7), match="they've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(22, 26), match="he's">
<re.Match object; span=(0, 6), match="you've">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(63, 69), match="you'll">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(21, 25), match="he's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(5, 11), match="what's">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(22, 26), match="he's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(18, 23), match="isn't">
<re.Match object; span=(34, 39), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(24, 29), match="don't">
<re.Match object; span=(28, 35), match="there's">
<re.Match object; span=(46, 52), match="you've">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(21, 25), match="he's">
<re.Match object; span=(15, 21), match="you'll">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(19, 24), match="don't">
<re.Match object; span=(29, 35), match="aren't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(26, 31), match="can't">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(38, 45), match="there's">
<re.Match object; span=(16, 21), match="let's">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(26, 31), match="don't">
<re.Match object; span=(36, 41), match="we're">
<re.Match object; span=(40, 44), match="he'd">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(4, 10), match="didn't">
<re.Match object; span=(0, 5), match="we're">
<re.Match object; span=(34, 39), match="we're">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(15, 19), match="i'll">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(13, 20), match="weren't">
<re.Match object; span=(35, 39), match="i've">
<re.Match object; span=(23, 29), match="that's">
<re.Match object; span=(37, 44), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(22, 29), match="haven't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(2, 10), match="couldn't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(65, 70), match="isn't">
<re.Match object; span=(33, 38), match="isn't">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(14, 21), match="doesn't">
<re.Match object; span=(50, 55), match="don't">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(17, 21), match="we'd">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(2, 10), match="couldn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(7, 14), match="haven't">
<re.Match object; span=(41, 48), match="haven't">
<re.Match object; span=(28, 31), match="i'd">
<re.Match object; span=(20, 24), match="i've">
<re.Match object; span=(12, 18), match="what's">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(26, 30), match="i've">
<re.Match object; span=(11, 18), match="doesn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(26, 31), match="don't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(37, 43), match="that's">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(19, 24), match="isn't">
<re.Match object; span=(20, 26), match="hasn't">
<re.Match object; span=(13, 19), match="you'll">
<re.Match object; span=(6, 9), match="i'm">
<re.Match object; span=(25, 32), match="haven't">
<re.Match object; span=(14, 21), match="haven't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(21, 26), match="don't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(10, 17), match="doesn't">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(13, 17), match="i'll">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(37, 43), match="you'll">
<re.Match object; span=(7, 14), match="haven't">
<re.Match object; span=(38, 43), match="don't">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(15, 21), match="didn't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(28, 34), match="that's">
<re.Match object; span=(27, 34), match="doesn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(26, 30), match="i've">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(32, 38), match="that's">
<re.Match object; span=(23, 29), match="didn't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(6, 11), match="don't">
<re.Match object; span=(13, 21), match="couldn't">
<re.Match object; span=(30, 34), match="i've">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(25, 30), match="can't">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(11, 17), match="you've">
<re.Match object; span=(12, 16), match="i've">
<re.Match object; span=(37, 42), match="can't">
<re.Match object; span=(60, 66), match="didn't">
<re.Match object; span=(6, 11), match="don't">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(33, 38), match="can't">
<re.Match object; span=(19, 24), match="don't">
<re.Match object; span=(30, 33), match="i'd">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(6, 13), match="doesn't">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(48, 53), match="can't">
<re.Match object; span=(30, 34), match="i've">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(4, 10), match="that's">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(18, 23), match="don't">
<re.Match object; span=(21, 26), match="don't">
<re.Match object; span=(51, 57), match="didn't">
<re.Match object; span=(46, 51), match="can't">
<re.Match object; span=(15, 20), match="don't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(17, 21), match="i'll">
<re.Match object; span=(28, 35), match="doesn't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(6, 10), match="i'll">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(36, 40), match="i've">
<re.Match object; span=(21, 29), match="couldn't">
<re.Match object; span=(12, 20), match="couldn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(21, 27), match="you'll">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(3, 9), match="you've">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(54, 58), match="i'll">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(22, 27), match="let's">
<re.Match object; span=(7, 13), match="you're">
<re.Match object; span=(45, 49), match="i've">
<re.Match object; span=(14, 19), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(13, 21), match="couldn't">
<re.Match object; span=(5, 9), match="i'll">
<re.Match object; span=(20, 25), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="you've">
<re.Match object; span=(34, 37), match="i'm">
<re.Match object; span=(54, 60), match="that's">
<re.Match object; span=(28, 33), match="you'd">
<re.Match object; span=(21, 25), match="i've">
<re.Match object; span=(5, 10), match="isn't">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(9, 15), match="you're">
<re.Match object; span=(35, 41), match="you'll">
<re.Match object; span=(3, 9), match="you've">
<re.Match object; span=(28, 34), match="you've">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(41, 47), match="hasn't">
<re.Match object; span=(13, 18), match="don't">
<re.Match object; span=(14, 19), match="you'd">
<re.Match object; span=(53, 58), match="you'd">
<re.Match object; span=(5, 12), match="doesn't">
<re.Match object; span=(13, 18), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(18, 24), match="that's">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(20, 26), match="what's">
<re.Match object; span=(37, 43), match="what's">
<re.Match object; span=(44, 49), match="don't">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(25, 28), match="i'm">
<re.Match object; span=(98, 103), match="isn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(8, 14), match="that's">
<re.Match object; span=(27, 33), match="that's">
<re.Match object; span=(5, 11), match="didn't">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(42, 48), match="that's">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(39, 43), match="i've">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(19, 23), match="i'll">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(23, 28), match="won't">
<re.Match object; span=(13, 17), match="i've">
<re.Match object; span=(8, 14), match="that's">
<re.Match object; span=(51, 57), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(32, 38), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(14, 20), match="that's">
<re.Match object; span=(54, 59), match="won't">
<re.Match object; span=(50, 55), match="don't">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(28, 34), match="that's">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(36, 40), match="i've">
<re.Match object; span=(18, 24), match="you've">
<re.Match object; span=(3, 11), match="wouldn't">
<re.Match object; span=(36, 40), match="i've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(24, 29), match="don't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(45, 50), match="don't">
<re.Match object; span=(15, 23), match="wouldn't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(8, 17), match="shouldn't">
<re.Match object; span=(23, 29), match="you'll">
<re.Match object; span=(50, 56), match="didn't">
<re.Match object; span=(83, 88), match="won't">
<re.Match object; span=(17, 22), match="don't">
<re.Match object; span=(12, 17), match="can't">
<re.Match object; span=(0, 7), match="they've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(13, 17), match="i've">
<re.Match object; span=(26, 33), match="doesn't">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(18, 21), match="i'm">
<re.Match object; span=(27, 31), match="i've">
<re.Match object; span=(30, 34), match="i've">
<re.Match object; span=(39, 43), match="i'll">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(49, 55), match="you'll">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(51, 55), match="i've">
<re.Match object; span=(64, 70), match="you'll">
<re.Match object; span=(35, 40), match="don't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(69, 73), match="i've">
<re.Match object; span=(30, 36), match="you're">
<re.Match object; span=(61, 67), match="you're">
<re.Match object; span=(24, 31), match="weren't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(11, 18), match="where's">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(50, 56), match="didn't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(32, 38), match="you'll">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(7, 13), match="didn't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(13, 19), match="aren't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(48, 53), match="don't">
<re.Match object; span=(72, 77), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(30, 36), match="that's">
<re.Match object; span=(5, 12), match="doesn't">
<re.Match object; span=(12, 17), match="don't">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(25, 29), match="he'd">
<re.Match object; span=(52, 58), match="that's">
<re.Match object; span=(10, 15), match="isn't">
<re.Match object; span=(47, 50), match="i'd">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(11, 18), match="doesn't">
<re.Match object; span=(28, 33), match="can't">
<re.Match object; span=(3, 8), match="isn't">
<re.Match object; span=(34, 38), match="he's">
<re.Match object; span=(55, 60), match="won't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(7, 14), match="doesn't">
<re.Match object; span=(0, 6), match="that's">
<re.Match object; span=(8, 13), match="don't">
<re.Match object; span=(38, 44), match="you'll">
<re.Match object; span=(15, 20), match="don't">
<re.Match object; span=(48, 52), match="i've">
<re.Match object; span=(67, 70), match="i'm">
<re.Match object; span=(0, 6), match="that's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(14, 20), match="that's">
<re.Match object; span=(33, 37), match="he'd">
<re.Match object; span=(28, 35), match="haven't">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(10, 16), match="that's">
<re.Match object; span=(27, 34), match="doesn't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(0, 5), match="won't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 7), match="haven't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(19, 24), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(17, 22), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(21, 27), match="didn't">
<re.Match object; span=(32, 38), match="didn't">
<re.Match object; span=(34, 40), match="that's">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(13, 19), match="didn't">
<re.Match object; span=(16, 22), match="didn't">
<re.Match object; span=(28, 33), match="don't">
<re.Match object; span=(0, 7), match="they're">
<re.Match object; span=(27, 34), match="they're">
<re.Match object; span=(24, 30), match="what's">
<re.Match object; span=(16, 21), match="isn't">
<re.Match object; span=(15, 22), match="they're">
<re.Match object; span=(26, 33), match="doesn't">
<re.Match object; span=(11, 16), match="isn't">
<re.Match object; span=(0, 5), match="won't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(18, 23), match="isn't">
<re.Match object; span=(75, 81), match="you're">
<re.Match object; span=(7, 13), match="didn't">
<re.Match object; span=(38, 43), match="won't">
<re.Match object; span=(29, 34), match="can't">
<re.Match object; span=(45, 50), match="don't">
<re.Match object; span=(37, 46), match="shouldn't">
<re.Match object; span=(33, 37), match="i've">
<re.Match object; span=(66, 71), match="don't">
<re.Match object; span=(49, 57), match="wouldn't">
<re.Match object; span=(19, 24), match="can't">
<re.Match object; span=(27, 35), match="couldn't">
<re.Match object; span=(15, 22), match="doesn't">
<re.Match object; span=(26, 31), match="won't">
<re.Match object; span=(16, 21), match="don't">
<re.Match object; span=(50, 56), match="you'll">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(44, 49), match="don't">
<re.Match object; span=(71, 76), match="don't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(11, 16), match="isn't">
<re.Match object; span=(10, 16), match="didn't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(14, 19), match="don't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(18, 26), match="wouldn't">
<re.Match object; span=(0, 5), match="we've">
<re.Match object; span=(2, 9), match="haven't">
<re.Match object; span=(38, 43), match="don't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(15, 21), match="you're">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(12, 18), match="didn't">
<re.Match object; span=(16, 23), match="haven't">
<re.Match object; span=(2, 10), match="wouldn't">
<re.Match object; span=(17, 21), match="i've">
<re.Match object; span=(13, 19), match="didn't">
<re.Match object; span=(3, 9), match="you've">
<re.Match object; span=(26, 31), match="don't">
<re.Match object; span=(34, 39), match="isn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(25, 31), match="they'd">
<re.Match object; span=(14, 21), match="there's">
<re.Match object; span=(20, 27), match="there's">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(22, 27), match="don't">
<re.Match object; span=(23, 29), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(23, 28), match="isn't">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(24, 30), match="didn't">
<re.Match object; span=(26, 30), match="i've">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(73, 80), match="doesn't">
<re.Match object; span=(20, 26), match="didn't">
<re.Match object; span=(37, 42), match="isn't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(0, 5), match="you'd">
<re.Match object; span=(35, 39), match="i'll">
<re.Match object; span=(62, 68), match="what's">
<re.Match object; span=(17, 21), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(33, 39), match="you've">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(19, 24), match="don't">
<re.Match object; span=(34, 39), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(13, 16), match="i'm">
<re.Match object; span=(12, 17), match="isn't">
<re.Match object; span=(18, 24), match="you're">
<re.Match object; span=(7, 14), match="haven't">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(68, 76), match="wouldn't">
<re.Match object; span=(23, 31), match="couldn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(5, 12), match="haven't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 10), match="wouldn't">
<re.Match object; span=(0, 6), match="you've">
<re.Match object; span=(38, 43), match="don't">
<re.Match object; span=(17, 24), match="doesn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(14, 17), match="i'm">
<re.Match object; span=(20, 26), match="you're">
<re.Match object; span=(3, 8), match="don't">
<re.Match object; span=(2, 7), match="won't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(2, 7), match="won't">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(49, 56), match="there's">
<re.Match object; span=(17, 23), match="you'll">
<re.Match object; span=(31, 35), match="he's">
<re.Match object; span=(37, 43), match="what's">
<re.Match object; span=(43, 47), match="i've">
<re.Match object; span=(0, 6), match="that's">
<re.Match object; span=(44, 50), match="you'll">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(13, 20), match="doesn't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(15, 20), match="don't">
<re.Match object; span=(6, 10), match="i've">
<re.Match object; span=(22, 28), match="you'll">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(22, 27), match="don't">
<re.Match object; span=(44, 47), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(23, 29), match="you'll">
<re.Match object; span=(9, 14), match="isn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(21, 26), match="don't">
<re.Match object; span=(12, 17), match="can't">
<re.Match object; span=(12, 19), match="doesn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(10, 15), match="don't">
<re.Match object; span=(68, 73), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(49, 54), match="can't">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(36, 41), match="don't">
<re.Match object; span=(33, 37), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(29, 35), match="you've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(17, 20), match="i'd">
<re.Match object; span=(9, 14), match="we've">
<re.Match object; span=(12, 17), match="don't">
<re.Match object; span=(27, 31), match="i've">
<re.Match object; span=(34, 41), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(38, 45), match="doesn't">
<re.Match object; span=(39, 45), match="didn't">
<re.Match object; span=(7, 14), match="weren't">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(6, 10), match="i'll">
<re.Match object; span=(6, 11), match="don't">
<re.Match object; span=(11, 14), match="i'd">
<re.Match object; span=(33, 38), match="can't">
<re.Match object; span=(16, 21), match="don't">
<re.Match object; span=(1, 4), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(16, 21), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(18, 23), match="don't">
<re.Match object; span=(27, 31), match="i've">
<re.Match object; span=(7, 14), match="haven't">
<re.Match object; span=(34, 38), match="i've">
<re.Match object; span=(12, 19), match="doesn't">
<re.Match object; span=(45, 50), match="won't">
<re.Match object; span=(153, 157), match="i've">
<re.Match object; span=(29, 35), match="aren't">
<re.Match object; span=(25, 29), match="i've">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(32, 38), match="you've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(4, 10), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(32, 37), match="don't">
<re.Match object; span=(7, 14), match="doesn't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(68, 73), match="can't">
<re.Match object; span=(7, 12), match="can't">
<re.Match object; span=(18, 22), match="i've">
<re.Match object; span=(40, 46), match="you're">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(38, 44), match="that's">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(54, 60), match="what's">
<re.Match object; span=(111, 117), match="that's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(38, 43), match="isn't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(0, 5), match="we're">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(11, 16), match="don't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(0, 5), match="we're">
<re.Match object; span=(0, 5), match="we're">
<re.Match object; span=(48, 51), match="i'm">
<re.Match object; span=(24, 29), match="we're">
<re.Match object; span=(0, 5), match="we're">
<re.Match object; span=(39, 44), match="isn't">
<re.Match object; span=(0, 5), match="we're">
<re.Match object; span=(2, 9), match="haven't">
<re.Match object; span=(0, 5), match="we're">
<re.Match object; span=(25, 31), match="you're">
<re.Match object; span=(15, 21), match="you're">
<re.Match object; span=(3, 9), match="you've">
<re.Match object; span=(1, 6), match="we're">
<re.Match object; span=(21, 24), match="i'm">
<re.Match object; span=(51, 56), match="we're">
<re.Match object; span=(28, 32), match="i've">
<re.Match object; span=(14, 20), match="you're">
<re.Match object; span=(12, 17), match="you'd">
<re.Match object; span=(0, 7), match="they're">
<re.Match object; span=(28, 33), match="isn't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(6, 12), match="hasn't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(25, 30), match="don't">
<re.Match object; span=(26, 32), match="you'll">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(34, 40), match="didn't">
<re.Match object; span=(28, 34), match="what's">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(23, 29), match="you're">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(20, 24), match="i've">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(22, 27), match="isn't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(9, 15), match="you're">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(0, 8), match="wouldn't">
<re.Match object; span=(9, 14), match="don't">
<re.Match object; span=(63, 67), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(13, 20), match="doesn't">
<re.Match object; span=(20, 27), match="doesn't">
<re.Match object; span=(3, 9), match="you've">
<re.Match object; span=(20, 25), match="don't">
<re.Match object; span=(9, 15), match="that's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(13, 18), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(40, 45), match="can't">
<re.Match object; span=(26, 32), match="you'll">
<re.Match object; span=(51, 57), match="didn't">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(39, 42), match="i'm">
<re.Match object; span=(23, 29), match="hasn't">
<re.Match object; span=(26, 31), match="don't">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(38, 44), match="you've">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(45, 51), match="you're">
<re.Match object; span=(20, 24), match="i've">
<re.Match object; span=(35, 41), match="aren't">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(30, 35), match="don't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(25, 30), match="can't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(17, 20), match="i'd">
<re.Match object; span=(15, 18), match="i'd">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(9, 13), match="i've">
<re.Match object; span=(27, 31), match="i've">
<re.Match object; span=(21, 27), match="didn't">
<re.Match object; span=(19, 25), match="that's">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(24, 29), match="don't">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(14, 20), match="you're">
<re.Match object; span=(35, 40), match="don't">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(10, 14), match="i've">
<re.Match object; span=(26, 31), match="don't">
<re.Match object; span=(25, 31), match="you'll">
<re.Match object; span=(25, 32), match="doesn't">
<re.Match object; span=(21, 28), match="doesn't">
<re.Match object; span=(17, 22), match="don't">
<re.Match object; span=(0, 5), match="who's">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(24, 29), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(36, 43), match="there's">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(27, 32), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(29, 34), match="can't">
<re.Match object; span=(17, 21), match="i'll">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(14, 18), match="i'll">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(12, 16), match="i've">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(44, 53), match="shouldn't">
<re.Match object; span=(71, 77), match="you're">
<re.Match object; span=(90, 93), match="i'm">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(6, 11), match="can't">
<re.Match object; span=(29, 34), match="don't">
<re.Match object; span=(3, 9), match="didn't">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(0, 5), match="we've">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(31, 37), match="you'll">
<re.Match object; span=(6, 9), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(17, 21), match="i've">
<re.Match object; span=(51, 58), match="doesn't">
<re.Match object; span=(16, 22), match="that's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(47, 50), match="i'm">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(4, 9), match="won't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(27, 32), match="isn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(68, 75), match="haven't">
<re.Match object; span=(21, 26), match="we're">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(22, 28), match="that's">
<re.Match object; span=(27, 36), match="shouldn't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(11, 17), match="didn't">
<re.Match object; span=(34, 39), match="can't">
<re.Match object; span=(2, 10), match="couldn't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(18, 24), match="you're">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(11, 16), match="don't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(37, 45), match="wouldn't">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(34, 40), match="you'll">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(10, 17), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(84, 90), match="that's">
<re.Match object; span=(32, 36), match="i've">
<re.Match object; span=(14, 18), match="i've">
<re.Match object; span=(7, 10), match="i'm">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(20, 27), match="doesn't">
<re.Match object; span=(22, 28), match="didn't">
<re.Match object; span=(20, 26), match="that's">
<re.Match object; span=(7, 13), match="you've">
<re.Match object; span=(6, 13), match="doesn't">
<re.Match object; span=(55, 60), match="won't">
<re.Match object; span=(18, 24), match="that's">
<re.Match object; span=(45, 51), match="didn't">
<re.Match object; span=(26, 30), match="i've">
<re.Match object; span=(15, 20), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(8, 14), match="what's">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(11, 18), match="there's">
<re.Match object; span=(23, 28), match="don't">
<re.Match object; span=(45, 51), match="you're">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(29, 35), match="you've">
<re.Match object; span=(11, 16), match="can't">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(37, 42), match="don't">
<re.Match object; span=(28, 33), match="don't">
<re.Match object; span=(10, 17), match="doesn't">
<re.Match object; span=(35, 43), match="couldn't">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(30, 36), match="you're">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(47, 53), match="hasn't">
<re.Match object; span=(2, 10), match="couldn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(14, 19), match="can't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(43, 50), match="they're">
<re.Match object; span=(20, 24), match="i've">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(6, 13), match="there's">
<re.Match object; span=(18, 23), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(31, 35), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(25, 29), match="i've">
<re.Match object; span=(11, 18), match="doesn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(35, 39), match="i've">
<re.Match object; span=(28, 33), match="don't">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(122, 129), match="haven't">
<re.Match object; span=(22, 28), match="you'll">
<re.Match object; span=(25, 32), match="there's">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(27, 32), match="won't">
<re.Match object; span=(10, 16), match="didn't">
<re.Match object; span=(26, 30), match="i've">
<re.Match object; span=(38, 42), match="i've">
<re.Match object; span=(43, 50), match="weren't">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(0, 6), match="that's">
<re.Match object; span=(19, 25), match="didn't">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(18, 22), match="i've">
<re.Match object; span=(42, 45), match="i'm">
<re.Match object; span=(46, 52), match="you'll">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(35, 39), match="i've">
<re.Match object; span=(14, 19), match="don't">
<re.Match object; span=(25, 29), match="i've">
<re.Match object; span=(49, 52), match="i'm">
<re.Match object; span=(27, 33), match="you'll">
<re.Match object; span=(10, 16), match="you'll">
<re.Match object; span=(41, 47), match="you'll">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(36, 39), match="i'm">
<re.Match object; span=(27, 31), match="i've">
<re.Match object; span=(29, 34), match="don't">
<re.Match object; span=(30, 35), match="don't">
<re.Match object; span=(26, 31), match="don't">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(44, 50), match="what's">
<re.Match object; span=(27, 31), match="i've">
<re.Match object; span=(33, 38), match="isn't">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(36, 39), match="i'm">
<re.Match object; span=(40, 46), match="that's">
<re.Match object; span=(11, 16), match="isn't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(25, 30), match="don't">
<re.Match object; span=(12, 17), match="you'd">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(6, 11), match="don't">
<re.Match object; span=(24, 30), match="you're">
<re.Match object; span=(33, 40), match="doesn't">
<re.Match object; span=(14, 19), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(17, 23), match="that's">
<re.Match object; span=(0, 5), match="who's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(39, 45), match="didn't">
<re.Match object; span=(115, 119), match="i've">
<re.Match object; span=(31, 36), match="you'd">
<re.Match object; span=(25, 29), match="i've">
<re.Match object; span=(12, 16), match="i've">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(18, 21), match="i'd">
<re.Match object; span=(36, 40), match="i've">
<re.Match object; span=(15, 21), match="that's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(12, 17), match="don't">
<re.Match object; span=(30, 35), match="don't">
<re.Match object; span=(42, 46), match="i've">
<re.Match object; span=(71, 77), match="you're">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(24, 29), match="don't">
<re.Match object; span=(6, 11), match="can't">
<re.Match object; span=(31, 36), match="don't">
<re.Match object; span=(28, 33), match="won't">
<re.Match object; span=(13, 18), match="won't">
<re.Match object; span=(3, 8), match="can't">
<re.Match object; span=(25, 29), match="i've">
<re.Match object; span=(9, 14), match="can't">
<re.Match object; span=(25, 30), match="don't">
<re.Match object; span=(17, 24), match="doesn't">
<re.Match object; span=(20, 24), match="he's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(3, 9), match="didn't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(36, 41), match="don't">
<re.Match object; span=(0, 6), match="you've">
<re.Match object; span=(8, 12), match="i've">
<re.Match object; span=(47, 50), match="i'm">
<re.Match object; span=(31, 35), match="i've">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(40, 46), match="you're">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(23, 29), match="didn't">
<re.Match object; span=(41, 45), match="i've">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(62, 65), match="i'm">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(27, 32), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(32, 36), match="i've">
<re.Match object; span=(4, 9), match="won't">
<re.Match object; span=(12, 17), match="won't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(12, 18), match="didn't">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(32, 38), match="what's">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(22, 28), match="you'll">
<re.Match object; span=(10, 16), match="that's">
<re.Match object; span=(6, 13), match="where's">
<re.Match object; span=(14, 20), match="you're">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(12, 19), match="doesn't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(16, 19), match="i'd">
<re.Match object; span=(12, 18), match="didn't">
<re.Match object; span=(46, 53), match="where's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(12, 19), match="doesn't">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(8, 14), match="you're">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(20, 25), match="don't">
<re.Match object; span=(12, 19), match="there's">
<re.Match object; span=(46, 51), match="don't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(12, 16), match="i've">
<re.Match object; span=(8, 13), match="don't">
<re.Match object; span=(29, 34), match="we've">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(9, 16), match="doesn't">
<re.Match object; span=(13, 20), match="where's">
<re.Match object; span=(25, 31), match="didn't">
<re.Match object; span=(20, 25), match="don't">
<re.Match object; span=(35, 42), match="doesn't">
<re.Match object; span=(21, 26), match="don't">
<re.Match object; span=(12, 16), match="i've">
<re.Match object; span=(24, 31), match="doesn't">
<re.Match object; span=(20, 25), match="don't">
<re.Match object; span=(36, 40), match="i've">
<re.Match object; span=(30, 34), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(5, 10), match="can't">
<re.Match object; span=(11, 16), match="don't">
<re.Match object; span=(0, 6), match="that's">
<re.Match object; span=(39, 42), match="i'm">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(34, 39), match="don't">
<re.Match object; span=(49, 55), match="you're">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(15, 22), match="doesn't">
<re.Match object; span=(20, 26), match="you'll">
<re.Match object; span=(24, 29), match="don't">
<re.Match object; span=(9, 14), match="don't">
<re.Match object; span=(15, 21), match="that's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(11, 20), match="shouldn't">
<re.Match object; span=(25, 31), match="you've">
<re.Match object; span=(62, 66), match="i've">
<re.Match object; span=(32, 37), match="we've">
<re.Match object; span=(49, 56), match="there's">
<re.Match object; span=(41, 45), match="i've">
<re.Match object; span=(10, 15), match="isn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(37, 42), match="can't">
<re.Match object; span=(46, 51), match="isn't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(16, 21), match="can't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(35, 40), match="don't">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(79, 85), match="that's">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(28, 33), match="don't">
<re.Match object; span=(10, 15), match="don't">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(5, 10), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(19, 24), match="we've">
<re.Match object; span=(48, 53), match="don't">
<re.Match object; span=(0, 6), match="you've">
<re.Match object; span=(33, 39), match="you've">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(26, 32), match="you'll">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(21, 26), match="isn't">
<re.Match object; span=(35, 41), match="that's">
<re.Match object; span=(21, 27), match="you're">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(3, 9), match="that's">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(40, 45), match="don't">
<re.Match object; span=(24, 29), match="won't">
<re.Match object; span=(4, 9), match="won't">
<re.Match object; span=(50, 54), match="i've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(21, 27), match="you'll">
<re.Match object; span=(22, 27), match="don't">
<re.Match object; span=(26, 32), match="aren't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(34, 39), match="won't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(46, 51), match="won't">
<re.Match object; span=(6, 10), match="i've">
<re.Match object; span=(52, 57), match="don't">
<re.Match object; span=(3, 8), match="won't">
<re.Match object; span=(10, 17), match="they're">
<re.Match object; span=(15, 20), match="don't">
<re.Match object; span=(13, 18), match="don't">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(33, 37), match="i've">
<re.Match object; span=(41, 45), match="i've">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(7, 13), match="they'd">
<re.Match object; span=(18, 25), match="doesn't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(73, 77), match="he's">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(54, 58), match="i've">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(38, 44), match="you've">
<re.Match object; span=(30, 34), match="i've">
<re.Match object; span=(39, 43), match="i've">
<re.Match object; span=(10, 15), match="don't">
<re.Match object; span=(23, 26), match="i'm">
<re.Match object; span=(46, 53), match="doesn't">
<re.Match object; span=(38, 44), match="didn't">
<re.Match object; span=(27, 33), match="that's">
<re.Match object; span=(39, 45), match="you're">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(42, 46), match="i've">
<re.Match object; span=(21, 28), match="doesn't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(18, 24), match="that's">
<re.Match object; span=(47, 54), match="doesn't">
<re.Match object; span=(46, 51), match="don't">
<re.Match object; span=(21, 26), match="isn't">
<re.Match object; span=(41, 47), match="that's">
<re.Match object; span=(27, 32), match="can't">
<re.Match object; span=(27, 32), match="can't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(35, 40), match="can't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(13, 18), match="don't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(19, 24), match="can't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(34, 39), match="can't">
<re.Match object; span=(18, 23), match="can't">
<re.Match object; span=(23, 28), match="can't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(27, 32), match="isn't">
<re.Match object; span=(12, 17), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(28, 33), match="can't">
<re.Match object; span=(13, 18), match="can't">
<re.Match object; span=(13, 18), match="can't">
<re.Match object; span=(12, 17), match="can't">
<re.Match object; span=(29, 33), match="i've">
<re.Match object; span=(10, 15), match="isn't">
<re.Match object; span=(25, 32), match="there's">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(45, 50), match="can't">
<re.Match object; span=(11, 17), match="you've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(35, 40), match="can't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(31, 36), match="can't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(30, 33), match="i'd">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(34, 39), match="can't">
<re.Match object; span=(18, 22), match="i've">
<re.Match object; span=(46, 52), match="you're">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(0, 8), match="wouldn't">
<re.Match object; span=(22, 28), match="what's">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(16, 23), match="what're">
<re.Match object; span=(36, 42), match="didn't">
<re.Match object; span=(26, 33), match="haven't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(9, 14), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(22, 27), match="don't">
<re.Match object; span=(7, 10), match="i'm">
<re.Match object; span=(30, 35), match="isn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(6, 11), match="don't">
<re.Match object; span=(11, 16), match="can't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(33, 37), match="i've">
<re.Match object; span=(11, 16), match="we've">
<re.Match object; span=(36, 44), match="couldn't">
<re.Match object; span=(1, 5), match="i'll">
<re.Match object; span=(26, 31), match="don't">
<re.Match object; span=(13, 18), match="can't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(31, 36), match="can't">
<re.Match object; span=(17, 24), match="doesn't">
<re.Match object; span=(8, 12), match="i'll">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(44, 49), match="isn't">
<re.Match object; span=(27, 31), match="i've">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(33, 42), match="shouldn't">
<re.Match object; span=(12, 17), match="can't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(40, 45), match="can't">
<re.Match object; span=(20, 25), match="can't">
<re.Match object; span=(13, 18), match="can't">
<re.Match object; span=(22, 30), match="couldn't">
<re.Match object; span=(13, 18), match="can't">
<re.Match object; span=(28, 36), match="couldn't">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(13, 18), match="can't">
<re.Match object; span=(22, 27), match="isn't">
<re.Match object; span=(48, 54), match="that's">
<re.Match object; span=(66, 71), match="won't">
<re.Match object; span=(3, 8), match="can't">
<re.Match object; span=(28, 35), match="doesn't">
<re.Match object; span=(12, 17), match="don't">
<re.Match object; span=(13, 18), match="can't">
<re.Match object; span=(12, 17), match="can't">
<re.Match object; span=(39, 43), match="i've">
<re.Match object; span=(22, 27), match="can't">
<re.Match object; span=(13, 18), match="can't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(13, 18), match="can't">
<re.Match object; span=(13, 18), match="can't">
<re.Match object; span=(13, 18), match="can't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(43, 49), match="you'll">
<re.Match object; span=(17, 23), match="hasn't">
<re.Match object; span=(48, 53), match="can't">
<re.Match object; span=(35, 41), match="what's">
<re.Match object; span=(28, 33), match="can't">
<re.Match object; span=(10, 14), match="i'll">
<re.Match object; span=(37, 42), match="can't">
<re.Match object; span=(37, 42), match="can't">
<re.Match object; span=(0, 5), match="who's">
<re.Match object; span=(15, 18), match="i'm">
<re.Match object; span=(42, 46), match="i'll">
<re.Match object; span=(50, 57), match="doesn't">
<re.Match object; span=(12, 17), match="can't">
<re.Match object; span=(35, 41), match="you're">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(31, 38), match="there's">
<re.Match object; span=(12, 17), match="can't">
<re.Match object; span=(28, 33), match="can't">
<re.Match object; span=(12, 17), match="can't">
<re.Match object; span=(12, 17), match="can't">
<re.Match object; span=(69, 75), match="that's">
<re.Match object; span=(3, 9), match="didn't">
<re.Match object; span=(43, 47), match="i've">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(3, 9), match="didn't">
<re.Match object; span=(12, 18), match="hasn't">
<re.Match object; span=(3, 8), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(22, 28), match="you're">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(17, 23), match="you're">
<re.Match object; span=(55, 60), match="don't">
<re.Match object; span=(14, 20), match="didn't">
<re.Match object; span=(4, 10), match="that's">
<re.Match object; span=(21, 25), match="i've">
<re.Match object; span=(21, 25), match="i'll">
<re.Match object; span=(7, 14), match="doesn't">
<re.Match object; span=(24, 30), match="that's">
<re.Match object; span=(29, 34), match="isn't">
<re.Match object; span=(42, 47), match="isn't">
<re.Match object; span=(18, 21), match="i'm">
<re.Match object; span=(34, 39), match="can't">
<re.Match object; span=(12, 17), match="you'd">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(27, 32), match="don't">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(22, 28), match="you'll">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(39, 46), match="doesn't">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(16, 21), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(25, 32), match="weren't">
<re.Match object; span=(17, 23), match="you're">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(7, 14), match="weren't">
<re.Match object; span=(47, 53), match="you'll">
<re.Match object; span=(7, 14), match="haven't">
<re.Match object; span=(31, 37), match="you'll">
<re.Match object; span=(25, 31), match="you'll">
<re.Match object; span=(21, 27), match="you're">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(21, 26), match="don't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(6, 13), match="doesn't">
<re.Match object; span=(117, 124), match="doesn't">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(50, 57), match="haven't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(94, 100), match="that's">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(19, 25), match="didn't">
<re.Match object; span=(5, 11), match="didn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(20, 26), match="you're">
<re.Match object; span=(0, 6), match="that's">
<re.Match object; span=(10, 15), match="can't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(16, 19), match="i'm">
<re.Match object; span=(27, 31), match="i've">
<re.Match object; span=(50, 54), match="i'll">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(21, 24), match="i'm">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(10, 15), match="don't">
<re.Match object; span=(25, 29), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(22, 27), match="can't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(13, 21), match="couldn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(3, 9), match="you've">
<re.Match object; span=(23, 28), match="don't">
<re.Match object; span=(10, 14), match="i've">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(9, 14), match="don't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(11, 16), match="can't">
<re.Match object; span=(22, 28), match="you'll">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(14, 19), match="don't">
<re.Match object; span=(37, 44), match="doesn't">
<re.Match object; span=(12, 17), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(28, 31), match="i'm">
<re.Match object; span=(46, 51), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(16, 21), match="don't">
<re.Match object; span=(59, 63), match="i'll">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(48, 53), match="don't">
<re.Match object; span=(71, 77), match="that's">
<re.Match object; span=(12, 17), match="don't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(16, 21), match="we're">
<re.Match object; span=(10, 15), match="don't">
<re.Match object; span=(3, 9), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(10, 15), match="can't">
<re.Match object; span=(42, 48), match="you'll">
<re.Match object; span=(38, 44), match="that's">
<re.Match object; span=(0, 5), match="who's">
<re.Match object; span=(9, 15), match="that's">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(26, 30), match="i've">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(36, 41), match="don't">
<re.Match object; span=(0, 5), match="she's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(35, 42), match="doesn't">
<re.Match object; span=(17, 22), match="isn't">
<re.Match object; span=(5, 13), match="wouldn't">
<re.Match object; span=(21, 27), match="didn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(25, 32), match="haven't">
<re.Match object; span=(24, 29), match="can't">
<re.Match object; span=(32, 38), match="didn't">
<re.Match object; span=(50, 58), match="couldn't">
<re.Match object; span=(15, 19), match="we'd">
<re.Match object; span=(49, 55), match="aren't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(35, 39), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(16, 21), match="won't">
<re.Match object; span=(39, 45), match="that's">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(25, 28), match="i'd">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(131, 138), match="they've">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(28, 34), match="that's">
<re.Match object; span=(11, 16), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(57, 62), match="don't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(13, 18), match="don't">
<re.Match object; span=(17, 22), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(4, 10), match="what's">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(53, 62), match="shouldn't">
<re.Match object; span=(4, 9), match="isn't">
<re.Match object; span=(28, 33), match="won't">
<re.Match object; span=(37, 42), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="who's">
<re.Match object; span=(19, 23), match="i'll">
<re.Match object; span=(46, 51), match="won't">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(30, 35), match="isn't">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(12, 16), match="i've">
<re.Match object; span=(17, 22), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(18, 22), match="i've">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(36, 42), match="you'll">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(28, 35), match="they're">
<re.Match object; span=(22, 27), match="don't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(10, 14), match="i'll">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(0, 5), match="we're">
<re.Match object; span=(16, 22), match="aren't">
<re.Match object; span=(22, 27), match="can't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(7, 14), match="doesn't">
<re.Match object; span=(16, 21), match="don't">
<re.Match object; span=(16, 22), match="you'll">
<re.Match object; span=(14, 20), match="aren't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(7, 12), match="isn't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(14, 18), match="i've">
<re.Match object; span=(11, 16), match="don't">
<re.Match object; span=(55, 60), match="don't">
<re.Match object; span=(30, 34), match="i've">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(69, 72), match="i'm">
<re.Match object; span=(19, 27), match="couldn't">
<re.Match object; span=(4, 10), match="that's">
<re.Match object; span=(64, 70), match="didn't">
<re.Match object; span=(126, 130), match="he's">
<re.Match object; span=(26, 32), match="that's">
<re.Match object; span=(29, 36), match="doesn't">
<re.Match object; span=(34, 40), match="didn't">
<re.Match object; span=(7, 13), match="you're">
<re.Match object; span=(47, 52), match="don't">
<re.Match object; span=(0, 6), match="you've">
<re.Match object; span=(12, 16), match="i've">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(14, 19), match="don't">
<re.Match object; span=(9, 14), match="don't">
<re.Match object; span=(0, 5), match="you'd">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(34, 40), match="didn't">
<re.Match object; span=(38, 45), match="doesn't">
<re.Match object; span=(20, 24), match="i've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(13, 18), match="can't">
<re.Match object; span=(26, 32), match="that's">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(42, 47), match="don't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(6, 9), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(22, 28), match="didn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(4, 8), match="he'd">
<re.Match object; span=(6, 11), match="can't">
<re.Match object; span=(0, 7), match="haven't">
<re.Match object; span=(21, 26), match="don't">
<re.Match object; span=(9, 15), match="that's">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(13, 18), match="isn't">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(37, 42), match="isn't">
<re.Match object; span=(47, 53), match="that's">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(40, 46), match="that's">
<re.Match object; span=(10, 15), match="don't">
<re.Match object; span=(49, 53), match="i've">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(15, 21), match="didn't">
<re.Match object; span=(51, 55), match="i've">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(28, 36), match="wouldn't">
<re.Match object; span=(17, 23), match="that's">
<re.Match object; span=(29, 35), match="that's">
<re.Match object; span=(9, 15), match="aren't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(22, 28), match="you'll">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(7, 14), match="haven't">
<re.Match object; span=(7, 14), match="haven't">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(24, 28), match="i'll">
<re.Match object; span=(4, 10), match="hasn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(12, 18), match="didn't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(43, 49), match="you've">
<re.Match object; span=(42, 47), match="don't">
<re.Match object; span=(61, 67), match="didn't">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(4, 11), match="doesn't">
<re.Match object; span=(24, 27), match="i'd">
<re.Match object; span=(47, 52), match="who's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(29, 35), match="you're">
<re.Match object; span=(14, 17), match="i'm">
<re.Match object; span=(37, 43), match="you're">
<re.Match object; span=(14, 19), match="isn't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(3, 10), match="there's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(13, 19), match="that's">
<re.Match object; span=(8, 13), match="can't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(43, 48), match="won't">
<re.Match object; span=(20, 24), match="i've">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(32, 37), match="who's">
<re.Match object; span=(51, 56), match="don't">
<re.Match object; span=(21, 26), match="can't">
<re.Match object; span=(94, 99), match="won't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(58, 64), match="didn't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(32, 38), match="you're">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(3, 9), match="didn't">
<re.Match object; span=(2, 9), match="haven't">
<re.Match object; span=(17, 24), match="there's">
<re.Match object; span=(49, 56), match="there's">
<re.Match object; span=(60, 69), match="shouldn't">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(13, 18), match="isn't">
<re.Match object; span=(42, 49), match="doesn't">
<re.Match object; span=(18, 23), match="can't">
<re.Match object; span=(55, 59), match="i've">
<re.Match object; span=(51, 56), match="you'd">
<re.Match object; span=(3, 8), match="don't">
<re.Match object; span=(24, 29), match="let's">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(37, 43), match="that's">
<re.Match object; span=(6, 11), match="don't">
<re.Match object; span=(39, 45), match="didn't">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(103, 110), match="there's">
<re.Match object; span=(122, 125), match="i'm">
<re.Match object; span=(21, 24), match="i'm">
<re.Match object; span=(6, 9), match="i'm">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(113, 119), match="that's">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(28, 33), match="can't">
<re.Match object; span=(18, 21), match="i'm">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(31, 38), match="there's">
<re.Match object; span=(26, 31), match="who's">
<re.Match object; span=(5, 11), match="hasn't">
<re.Match object; span=(21, 25), match="he's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(31, 35), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="that's">
<re.Match object; span=(22, 27), match="don't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(6, 10), match="i've">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(10, 14), match="i've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(47, 53), match="that's">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(28, 32), match="i've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(9, 14), match="can't">
<re.Match object; span=(4, 10), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="aren't">
<re.Match object; span=(7, 13), match="didn't">
<re.Match object; span=(25, 30), match="don't">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(35, 39), match="i've">
<re.Match object; span=(8, 13), match="don't">
<re.Match object; span=(8, 13), match="isn't">
<re.Match object; span=(14, 19), match="don't">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(19, 24), match="don't">
<re.Match object; span=(25, 29), match="i've">
<re.Match object; span=(20, 26), match="you've">
<re.Match object; span=(36, 40), match="i've">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(19, 22), match="i'm">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(32, 36), match="i've">
<re.Match object; span=(30, 35), match="can't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(15, 21), match="aren't">
<re.Match object; span=(14, 19), match="don't">
<re.Match object; span=(11, 16), match="don't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(28, 35), match="doesn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(12, 17), match="can't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(43, 48), match="don't">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(47, 51), match="i've">
<re.Match object; span=(9, 14), match="who's">
<re.Match object; span=(24, 30), match="that's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(134, 139), match="he'll">
<re.Match object; span=(437, 442), match="she's">
<re.Match object; span=(18, 27), match="shouldn't">
<re.Match object; span=(35, 40), match="don't">
<re.Match object; span=(5, 11), match="you're">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(33, 37), match="i've">
<re.Match object; span=(43, 47), match="i've">
<re.Match object; span=(28, 32), match="i've">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(4, 10), match="didn't">
<re.Match object; span=(13, 18), match="don't">
<re.Match object; span=(26, 31), match="don't">
<re.Match object; span=(15, 20), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(12, 16), match="i've">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(24, 30), match="didn't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(30, 37), match="there's">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(40, 45), match="don't">
<re.Match object; span=(39, 46), match="doesn't">
<re.Match object; span=(26, 31), match="don't">
<re.Match object; span=(16, 21), match="isn't">
<re.Match object; span=(23, 28), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(37, 42), match="don't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(20, 27), match="doesn't">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(30, 35), match="isn't">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(13, 20), match="doesn't">
<re.Match object; span=(42, 46), match="i've">
<re.Match object; span=(9, 14), match="can't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(31, 36), match="can't">
<re.Match object; span=(33, 38), match="can't">
<re.Match object; span=(28, 35), match="they'll">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(34, 38), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 10), match="wouldn't">
<re.Match object; span=(91, 98), match="there's">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(40, 44), match="i've">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(36, 40), match="i've">
<re.Match object; span=(9, 15), match="didn't">
<re.Match object; span=(27, 30), match="i'd">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(23, 28), match="can't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(32, 36), match="i've">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(0, 6), match="you've">
<re.Match object; span=(20, 25), match="we've">
<re.Match object; span=(3, 11), match="couldn't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(37, 41), match="i've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(27, 32), match="don't">
<re.Match object; span=(9, 15), match="that's">
<re.Match object; span=(25, 29), match="he's">
<re.Match object; span=(45, 52), match="doesn't">
<re.Match object; span=(25, 29), match="i've">
<re.Match object; span=(17, 23), match="you'll">
<re.Match object; span=(15, 21), match="you've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(11, 16), match="don't">
<re.Match object; span=(26, 33), match="they're">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(23, 26), match="i'm">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(43, 49), match="aren't">
<re.Match object; span=(11, 17), match="didn't">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(142, 146), match="i've">
<re.Match object; span=(39, 42), match="i'd">
<re.Match object; span=(10, 16), match="you've">
<re.Match object; span=(11, 14), match="i'd">
<re.Match object; span=(57, 63), match="you'll">
<re.Match object; span=(44, 50), match="that's">
<re.Match object; span=(9, 15), match="you're">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(9, 12), match="i'd">
<re.Match object; span=(25, 29), match="i've">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(8, 15), match="they're">
<re.Match object; span=(20, 25), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(23, 28), match="don't">
<re.Match object; span=(0, 6), match="you've">
<re.Match object; span=(17, 24), match="they're">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(21, 26), match="isn't">
<re.Match object; span=(38, 43), match="don't">
<re.Match object; span=(87, 93), match="you'll">
<re.Match object; span=(122, 128), match="you'll">
<re.Match object; span=(77, 81), match="i've">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(43, 47), match="i've">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(11, 18), match="they're">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(7, 13), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(9, 16), match="doesn't">
<re.Match object; span=(48, 53), match="don't">
<re.Match object; span=(10, 14), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(5, 10), match="isn't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(42, 46), match="i've">
<re.Match object; span=(10, 14), match="i've">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(36, 42), match="you've">
<re.Match object; span=(23, 29), match="you'll">
<re.Match object; span=(10, 15), match="don't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(2, 10), match="wouldn't">
<re.Match object; span=(12, 19), match="doesn't">
<re.Match object; span=(7, 10), match="i'm">
<re.Match object; span=(18, 23), match="can't">
<re.Match object; span=(27, 31), match="i've">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(19, 26), match="doesn't">
<re.Match object; span=(41, 48), match="they're">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(28, 34), match="you're">
<re.Match object; span=(48, 52), match="i've">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(7, 14), match="doesn't">
<re.Match object; span=(32, 37), match="won't">
<re.Match object; span=(0, 6), match="that's">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(71, 76), match="can't">
<re.Match object; span=(33, 38), match="don't">
<re.Match object; span=(28, 35), match="doesn't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(19, 22), match="i'm">
<re.Match object; span=(14, 20), match="that's">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(62, 65), match="i'm">
<re.Match object; span=(69, 72), match="i'm">
<re.Match object; span=(5, 11), match="didn't">
<re.Match object; span=(13, 18), match="can't">
<re.Match object; span=(9, 14), match="you'd">
<re.Match object; span=(13, 19), match="that's">
<re.Match object; span=(45, 51), match="didn't">
<re.Match object; span=(39, 48), match="shouldn't">
<re.Match object; span=(4, 10), match="you're">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(20, 28), match="wouldn't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(40, 46), match="that's">
<re.Match object; span=(24, 30), match="that's">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(38, 45), match="doesn't">
<re.Match object; span=(14, 21), match="weren't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(13, 18), match="don't">
<re.Match object; span=(36, 41), match="don't">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(55, 61), match="you're">
<re.Match object; span=(32, 36), match="i've">
<re.Match object; span=(16, 22), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(15, 21), match="you're">
<re.Match object; span=(50, 58), match="couldn't">
<re.Match object; span=(18, 23), match="don't">
<re.Match object; span=(62, 69), match="they're">
<re.Match object; span=(12, 18), match="didn't">
<re.Match object; span=(45, 53), match="wouldn't">
<re.Match object; span=(11, 20), match="shouldn't">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(39, 42), match="i'm">
<re.Match object; span=(36, 41), match="can't">
<re.Match object; span=(23, 28), match="don't">
<re.Match object; span=(60, 66), match="you'll">
<re.Match object; span=(91, 97), match="you're">
<re.Match object; span=(66, 73), match="doesn't">
<re.Match object; span=(6, 13), match="where's">
<re.Match object; span=(23, 26), match="i'd">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(14, 20), match="you'll">
<re.Match object; span=(32, 38), match="that's">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(16, 21), match="isn't">
<re.Match object; span=(122, 131), match="shouldn't">
<re.Match object; span=(58, 63), match="you'd">
<re.Match object; span=(2, 10), match="couldn't">
<re.Match object; span=(26, 32), match="you'll">
<re.Match object; span=(11, 16), match="you'd">
<re.Match object; span=(50, 56), match="that's">
<re.Match object; span=(27, 30), match="i'm">
<re.Match object; span=(27, 30), match="i'm">
<re.Match object; span=(20, 25), match="don't">
<re.Match object; span=(49, 54), match="don't">
<re.Match object; span=(29, 34), match="don't">
<re.Match object; span=(2, 10), match="couldn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(6, 11), match="isn't">
<re.Match object; span=(59, 64), match="don't">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(21, 27), match="you'll">
<re.Match object; span=(0, 7), match="they're">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(32, 37), match="can't">
<re.Match object; span=(9, 14), match="can't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(22, 28), match="you'll">
<re.Match object; span=(3, 8), match="won't">
<re.Match object; span=(70, 76), match="what's">
<re.Match object; span=(19, 24), match="can't">
<re.Match object; span=(34, 39), match="don't">
<re.Match object; span=(49, 52), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(16, 22), match="that's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(68, 72), match="i've">
<re.Match object; span=(8, 14), match="what's">
<re.Match object; span=(20, 26), match="you're">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(12, 17), match="don't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(29, 35), match="you'll">
<re.Match object; span=(13, 19), match="hasn't">
<re.Match object; span=(3, 8), match="don't">
<re.Match object; span=(9, 14), match="can't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(60, 65), match="can't">
<re.Match object; span=(20, 24), match="i've">
<re.Match object; span=(33, 38), match="can't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(41, 46), match="isn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="that's">
<re.Match object; span=(31, 36), match="who's">
<re.Match object; span=(30, 37), match="doesn't">
<re.Match object; span=(29, 36), match="doesn't">
<re.Match object; span=(22, 28), match="didn't">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(13, 18), match="can't">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(25, 32), match="there's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(25, 32), match="there's">
<re.Match object; span=(33, 37), match="i've">
<re.Match object; span=(0, 7), match="they've">
<re.Match object; span=(0, 9), match="shouldn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(9, 14), match="you'd">
<re.Match object; span=(27, 33), match="that's">
<re.Match object; span=(42, 49), match="there's">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(46, 52), match="you're">
<re.Match object; span=(59, 65), match="who'll">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(35, 40), match="don't">
<re.Match object; span=(51, 54), match="i'm">
<re.Match object; span=(0, 5), match="who's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(27, 33), match="you're">
<re.Match object; span=(29, 32), match="i'm">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(23, 28), match="isn't">
<re.Match object; span=(38, 43), match="isn't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(24, 29), match="don't">
<re.Match object; span=(20, 27), match="there's">
<re.Match object; span=(0, 5), match="who's">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(11, 17), match="aren't">
<re.Match object; span=(1, 6), match="won't">
<re.Match object; span=(8, 16), match="couldn't">
<re.Match object; span=(3, 8), match="can't">
<re.Match object; span=(17, 23), match="that's">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(4, 9), match="won't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(34, 38), match="i've">
<re.Match object; span=(29, 36), match="they're">
<re.Match object; span=(45, 52), match="doesn't">
<re.Match object; span=(41, 47), match="you've">
<re.Match object; span=(14, 21), match="haven't">
<re.Match object; span=(23, 30), match="there's">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(36, 40), match="i've">
<re.Match object; span=(50, 58), match="wouldn't">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(9, 15), match="aren't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(14, 19), match="don't">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(18, 22), match="i've">
<re.Match object; span=(19, 24), match="can't">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(70, 76), match="you're">
<re.Match object; span=(37, 40), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(18, 22), match="i've">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(29, 36), match="doesn't">
<re.Match object; span=(38, 43), match="don't">
<re.Match object; span=(17, 23), match="that's">
<re.Match object; span=(51, 55), match="i've">
<re.Match object; span=(8, 13), match="isn't">
<re.Match object; span=(29, 34), match="don't">
<re.Match object; span=(5, 13), match="couldn't">
<re.Match object; span=(20, 28), match="couldn't">
<re.Match object; span=(38, 43), match="don't">
<re.Match object; span=(19, 24), match="don't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(28, 35), match="there's">
<re.Match object; span=(22, 27), match="can't">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(12, 18), match="you've">
<re.Match object; span=(87, 93), match="aren't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(8, 13), match="don't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(27, 33), match="that's">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(20, 25), match="isn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(25, 31), match="you've">
<re.Match object; span=(33, 39), match="didn't">
<re.Match object; span=(13, 16), match="i'd">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(7, 14), match="haven't">
<re.Match object; span=(36, 41), match="don't">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(16, 22), match="didn't">
<re.Match object; span=(20, 27), match="doesn't">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(26, 31), match="can't">
<re.Match object; span=(28, 35), match="where's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(8, 14), match="didn't">
<re.Match object; span=(27, 31), match="i've">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(9, 13), match="i've">
<re.Match object; span=(20, 24), match="i've">
<re.Match object; span=(27, 31), match="he's">
<re.Match object; span=(9, 14), match="can't">
<re.Match object; span=(15, 21), match="what's">
<re.Match object; span=(21, 27), match="you'll">
<re.Match object; span=(50, 56), match="you'll">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(10, 16), match="didn't">
<re.Match object; span=(46, 51), match="don't">
<re.Match object; span=(29, 35), match="you'll">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(17, 22), match="don't">
<re.Match object; span=(17, 23), match="you're">
<re.Match object; span=(50, 53), match="i'm">
<re.Match object; span=(10, 18), match="wouldn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(51, 55), match="i've">
<re.Match object; span=(24, 29), match="isn't">
<re.Match object; span=(33, 39), match="that's">
<re.Match object; span=(6, 13), match="there's">
<re.Match object; span=(34, 41), match="doesn't">
<re.Match object; span=(5, 13), match="couldn't">
<re.Match object; span=(70, 76), match="you'll">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(11, 17), match="didn't">
<re.Match object; span=(26, 30), match="i've">
<re.Match object; span=(56, 61), match="isn't">
<re.Match object; span=(8, 13), match="don't">
<re.Match object; span=(40, 45), match="can't">
<re.Match object; span=(50, 54), match="i've">
<re.Match object; span=(35, 39), match="he's">
<re.Match object; span=(13, 20), match="doesn't">
<re.Match object; span=(21, 27), match="you've">
<re.Match object; span=(26, 30), match="i've">
<re.Match object; span=(6, 11), match="can't">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(17, 22), match="don't">
<re.Match object; span=(52, 57), match="don't">
<re.Match object; span=(5, 8), match="i'd">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(6, 10), match="i've">
<re.Match object; span=(2, 10), match="couldn't">
<re.Match object; span=(32, 37), match="isn't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(45, 50), match="isn't">
<re.Match object; span=(61, 67), match="that's">
<re.Match object; span=(21, 26), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 7), match="they'll">
<re.Match object; span=(36, 42), match="that's">
<re.Match object; span=(42, 46), match="i've">
<re.Match object; span=(29, 34), match="isn't">
<re.Match object; span=(17, 23), match="that's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(28, 34), match="didn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(39, 44), match="don't">
<re.Match object; span=(7, 12), match="can't">
<re.Match object; span=(29, 35), match="that's">
<re.Match object; span=(40, 46), match="that's">
<re.Match object; span=(29, 34), match="don't">
<re.Match object; span=(74, 80), match="that's">
<re.Match object; span=(10, 14), match="i've">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(6, 11), match="who's">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(12, 17), match="won't">
<re.Match object; span=(31, 35), match="i've">
<re.Match object; span=(17, 22), match="can't">
<re.Match object; span=(52, 58), match="that's">
<re.Match object; span=(30, 34), match="i'll">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(3, 9), match="that's">
<re.Match object; span=(0, 6), match="you've">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(67, 74), match="there's">
<re.Match object; span=(21, 27), match="that's">
<re.Match object; span=(23, 29), match="you'll">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(4, 10), match="that's">
<re.Match object; span=(11, 16), match="don't">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(30, 35), match="don't">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(42, 48), match="you'll">
<re.Match object; span=(38, 43), match="don't">
<re.Match object; span=(7, 13), match="you'll">
<re.Match object; span=(59, 65), match="you'll">
<re.Match object; span=(28, 35), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(38, 43), match="don't">
<re.Match object; span=(63, 68), match="can't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(40, 47), match="doesn't">
<re.Match object; span=(9, 15), match="didn't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(28, 35), match="doesn't">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(29, 34), match="we're">
<re.Match object; span=(7, 10), match="i'm">
<re.Match object; span=(23, 30), match="doesn't">
<re.Match object; span=(24, 30), match="what's">
<re.Match object; span=(12, 17), match="don't">
<re.Match object; span=(13, 19), match="you've">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(17, 23), match="you've">
<re.Match object; span=(48, 51), match="i'm">
<re.Match object; span=(28, 34), match="aren't">
<re.Match object; span=(38, 42), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(28, 33), match="can't">
<re.Match object; span=(34, 40), match="you're">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(39, 46), match="doesn't">
<re.Match object; span=(5, 10), match="isn't">
<re.Match object; span=(32, 39), match="haven't">
<re.Match object; span=(43, 52), match="shouldn't">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(29, 36), match="doesn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(23, 29), match="that's">
<re.Match object; span=(3, 9), match="you've">
<re.Match object; span=(11, 15), match="i'll">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(7, 14), match="haven't">
<re.Match object; span=(32, 37), match="don't">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(13, 18), match="you'd">
<re.Match object; span=(39, 43), match="i've">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(9, 17), match="couldn't">
<re.Match object; span=(10, 16), match="you're">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(19, 24), match="won't">
<re.Match object; span=(0, 6), match="you've">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(36, 41), match="don't">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(43, 51), match="wouldn't">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(7, 13), match="what's">
<re.Match object; span=(28, 33), match="don't">
<re.Match object; span=(24, 29), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(22, 26), match="i'll">
<re.Match object; span=(11, 20), match="shouldn't">
<re.Match object; span=(49, 55), match="that's">
<re.Match object; span=(30, 38), match="couldn't">
<re.Match object; span=(13, 18), match="can't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(48, 53), match="won't">
<re.Match object; span=(23, 28), match="you'd">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(86, 94), match="wouldn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(18, 24), match="you're">
<re.Match object; span=(28, 33), match="don't">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(9, 13), match="i've">
<re.Match object; span=(26, 32), match="you're">
<re.Match object; span=(0, 5), match="isn't">
<re.Match object; span=(39, 43), match="i've">
<re.Match object; span=(0, 7), match="they're">
<re.Match object; span=(52, 58), match="you're">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(8, 13), match="don't">
<re.Match object; span=(0, 7), match="they're">
<re.Match object; span=(17, 24), match="they're">
<re.Match object; span=(43, 50), match="they're">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(11, 17), match="you're">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(2, 7), match="won't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(50, 55), match="isn't">
<re.Match object; span=(23, 31), match="couldn't">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(9, 12), match="i'm">
<re.Match object; span=(70, 75), match="isn't">
<re.Match object; span=(35, 39), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(6, 13), match="doesn't">
<re.Match object; span=(7, 13), match="didn't">
<re.Match object; span=(23, 28), match="can't">
<re.Match object; span=(28, 34), match="hadn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(14, 19), match="isn't">
<re.Match object; span=(31, 37), match="didn't">
<re.Match object; span=(30, 34), match="i've">
<re.Match object; span=(0, 5), match="who's">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(13, 18), match="don't">
<re.Match object; span=(11, 16), match="isn't">
<re.Match object; span=(8, 15), match="doesn't">
<re.Match object; span=(14, 19), match="don't">
<re.Match object; span=(39, 44), match="can't">
<re.Match object; span=(11, 16), match="isn't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(37, 43), match="didn't">
<re.Match object; span=(10, 17), match="doesn't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(12, 16), match="i'll">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(25, 30), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(18, 23), match="don't">
<re.Match object; span=(44, 51), match="doesn't">
<re.Match object; span=(61, 66), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(40, 44), match="i've">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(33, 39), match="aren't">
<re.Match object; span=(18, 24), match="you're">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(2, 10), match="couldn't">
<re.Match object; span=(33, 40), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(9, 14), match="won't">
<re.Match object; span=(29, 33), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(30, 36), match="you've">
<re.Match object; span=(15, 21), match="didn't">
<re.Match object; span=(73, 78), match="can't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(9, 15), match="that's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(42, 47), match="can't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(24, 29), match="she's">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(8, 14), match="you'll">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(8, 13), match="she's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(11, 17), match="that's">
<re.Match object; span=(3, 8), match="can't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(141, 146), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(34, 40), match="you're">
<re.Match object; span=(20, 24), match="i've">
<re.Match object; span=(0, 5), match="we're">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(18, 22), match="i've">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(38, 43), match="isn't">
<re.Match object; span=(68, 73), match="don't">
<re.Match object; span=(20, 23), match="i'm">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(23, 29), match="that's">
<re.Match object; span=(19, 25), match="what's">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(22, 28), match="didn't">
<re.Match object; span=(29, 33), match="i've">
<re.Match object; span=(5, 9), match="i'll">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(13, 16), match="i'm">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(25, 31), match="you've">
<re.Match object; span=(11, 16), match="isn't">
<re.Match object; span=(16, 22), match="you're">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(24, 31), match="doesn't">
<re.Match object; span=(0, 5), match="we're">
<re.Match object; span=(74, 79), match="who's">
<re.Match object; span=(0, 6), match="that's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(34, 39), match="don't">
<re.Match object; span=(3, 8), match="can't">
<re.Match object; span=(9, 17), match="couldn't">
<re.Match object; span=(24, 30), match="you're">
<re.Match object; span=(34, 38), match="i'll">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(2, 10), match="wouldn't">
<re.Match object; span=(6, 9), match="i'm">
<re.Match object; span=(15, 22), match="doesn't">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(12, 18), match="that's">
<re.Match object; span=(53, 56), match="i'm">
<re.Match object; span=(26, 31), match="don't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(27, 32), match="won't">
<re.Match object; span=(3, 8), match="isn't">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(64, 70), match="that's">
<re.Match object; span=(105, 110), match="won't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(12, 15), match="i'm">
<re.Match object; span=(29, 36), match="doesn't">
<re.Match object; span=(35, 41), match="aren't">
<re.Match object; span=(99, 104), match="don't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(2, 10), match="wouldn't">
<re.Match object; span=(45, 50), match="don't">
<re.Match object; span=(57, 62), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(23, 29), match="you've">
<re.Match object; span=(49, 55), match="you'll">
<re.Match object; span=(8, 13), match="don't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(58, 65), match="doesn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(1, 7), match="that's">
<re.Match object; span=(17, 22), match="don't">
<re.Match object; span=(15, 20), match="don't">
<re.Match object; span=(74, 81), match="doesn't">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(30, 35), match="can't">
<re.Match object; span=(22, 29), match="doesn't">
<re.Match object; span=(34, 40), match="that's">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(19, 24), match="can't">
<re.Match object; span=(32, 38), match="that's">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(13, 18), match="isn't">
<re.Match object; span=(21, 25), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(5, 11), match="didn't">
<re.Match object; span=(22, 27), match="don't">
<re.Match object; span=(36, 41), match="don't">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(20, 26), match="you're">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(101, 108), match="doesn't">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(82, 89), match="doesn't">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(21, 25), match="he'd">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(6, 11), match="don't">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(25, 30), match="don't">
<re.Match object; span=(11, 16), match="don't">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(44, 48), match="i've">
<re.Match object; span=(25, 29), match="i'll">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(30, 36), match="didn't">
<re.Match object; span=(33, 37), match="i've">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(3, 9), match="didn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(51, 59), match="couldn't">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(0, 6), match="you've">
<re.Match object; span=(19, 24), match="won't">
<re.Match object; span=(20, 26), match="you've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(0, 6), match="you've">
<re.Match object; span=(24, 31), match="doesn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(13, 20), match="doesn't">
<re.Match object; span=(10, 15), match="isn't">
<re.Match object; span=(28, 34), match="that's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(6, 13), match="doesn't">
<re.Match object; span=(16, 21), match="don't">
<re.Match object; span=(12, 17), match="don't">
<re.Match object; span=(45, 50), match="isn't">
<re.Match object; span=(5, 10), match="isn't">
<re.Match object; span=(9, 14), match="can't">
<re.Match object; span=(14, 19), match="don't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(16, 24), match="couldn't">
<re.Match object; span=(11, 16), match="isn't">
<re.Match object; span=(23, 28), match="don't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(51, 58), match="doesn't">
<re.Match object; span=(27, 30), match="i'm">
<re.Match object; span=(43, 47), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(7, 13), match="aren't">
<re.Match object; span=(37, 42), match="don't">
<re.Match object; span=(25, 31), match="you're">
<re.Match object; span=(0, 5), match="we're">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(14, 17), match="i'd">
<re.Match object; span=(31, 37), match="that's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(28, 34), match="you've">
<re.Match object; span=(33, 38), match="don't">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(28, 34), match="you'll">
<re.Match object; span=(19, 26), match="doesn't">
<re.Match object; span=(46, 53), match="doesn't">
<re.Match object; span=(7, 14), match="doesn't">
<re.Match object; span=(42, 48), match="you'll">
<re.Match object; span=(37, 42), match="can't">
<re.Match object; span=(0, 6), match="you've">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(48, 56), match="couldn't">
<re.Match object; span=(6, 13), match="doesn't">
<re.Match object; span=(56, 61), match="isn't">
<re.Match object; span=(66, 74), match="wouldn't">
<re.Match object; span=(26, 32), match="aren't">
<re.Match object; span=(18, 23), match="she's">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(14, 19), match="won't">
<re.Match object; span=(10, 15), match="don't">
<re.Match object; span=(7, 12), match="isn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(3, 9), match="you've">
<re.Match object; span=(19, 25), match="that's">
<re.Match object; span=(21, 27), match="you've">
<re.Match object; span=(0, 5), match="who's">
<re.Match object; span=(47, 51), match="i've">
<re.Match object; span=(12, 17), match="we've">
<re.Match object; span=(8, 13), match="don't">
<re.Match object; span=(33, 40), match="where's">
<re.Match object; span=(50, 57), match="doesn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(11, 17), match="that's">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(34, 38), match="i've">
<re.Match object; span=(21, 25), match="i'll">
<re.Match object; span=(65, 71), match="you're">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(30, 34), match="i've">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(5, 11), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(14, 21), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(34, 38), match="i've">
<re.Match object; span=(54, 58), match="he's">
<re.Match object; span=(31, 36), match="isn't">
<re.Match object; span=(12, 16), match="i've">
<re.Match object; span=(13, 20), match="doesn't">
<re.Match object; span=(18, 23), match="don't">
<re.Match object; span=(29, 35), match="you're">
<re.Match object; span=(29, 35), match="you're">
<re.Match object; span=(6, 13), match="doesn't">
<re.Match object; span=(46, 51), match="can't">
<re.Match object; span=(5, 11), match="what's">
<re.Match object; span=(31, 37), match="what's">
<re.Match object; span=(28, 34), match="you're">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(20, 27), match="doesn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(24, 28), match="i'll">
<re.Match object; span=(24, 29), match="can't">
<re.Match object; span=(29, 35), match="didn't">
<re.Match object; span=(24, 29), match="don't">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(45, 49), match="he's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(9, 13), match="i'll">
<re.Match object; span=(15, 23), match="wouldn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(10, 15), match="don't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(23, 29), match="that's">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(26, 32), match="you've">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(29, 34), match="don't">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(20, 25), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(51, 57), match="you've">
<re.Match object; span=(37, 41), match="i've">
<re.Match object; span=(58, 65), match="doesn't">
<re.Match object; span=(20, 24), match="i've">
<re.Match object; span=(0, 5), match="we've">
<re.Match object; span=(36, 42), match="you've">
<re.Match object; span=(91, 97), match="didn't">
<re.Match object; span=(67, 74), match="doesn't">
<re.Match object; span=(50, 55), match="don't">
<re.Match object; span=(4, 7), match="i'm">
<re.Match object; span=(31, 34), match="i'm">
<re.Match object; span=(30, 34), match="i've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(4, 9), match="isn't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(11, 18), match="they're">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 5), match="who's">
<re.Match object; span=(63, 68), match="don't">
<re.Match object; span=(29, 34), match="don't">
<re.Match object; span=(19, 25), match="aren't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(6, 11), match="don't">
<re.Match object; span=(18, 22), match="i've">
<re.Match object; span=(8, 14), match="didn't">
<re.Match object; span=(32, 37), match="don't">
<re.Match object; span=(32, 36), match="i've">
<re.Match object; span=(28, 35), match="haven't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(74, 78), match="i've">
<re.Match object; span=(18, 24), match="that's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="who's">
<re.Match object; span=(14, 19), match="isn't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(46, 52), match="that's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(36, 43), match="doesn't">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="who's">
<re.Match object; span=(8, 14), match="you're">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(18, 24), match="you'll">
<re.Match object; span=(14, 17), match="i'm">
<re.Match object; span=(22, 28), match="hasn't">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(26, 31), match="isn't">
<re.Match object; span=(15, 20), match="don't">
<re.Match object; span=(16, 24), match="couldn't">
<re.Match object; span=(21, 28), match="doesn't">
<re.Match object; span=(14, 21), match="haven't">
<re.Match object; span=(3, 8), match="isn't">
<re.Match object; span=(32, 38), match="you'll">
<re.Match object; span=(27, 31), match="i've">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(11, 14), match="i'm">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(47, 52), match="won't">
<re.Match object; span=(10, 13), match="i'm">
<re.Match object; span=(32, 36), match="i've">
<re.Match object; span=(39, 46), match="doesn't">
<re.Match object; span=(24, 30), match="you'll">
<re.Match object; span=(18, 22), match="i've">
<re.Match object; span=(28, 36), match="couldn't">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(27, 33), match="you're">
<re.Match object; span=(19, 25), match="you've">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(19, 25), match="you're">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(37, 42), match="can't">
<re.Match object; span=(15, 21), match="you'll">
<re.Match object; span=(38, 43), match="don't">
<re.Match object; span=(45, 51), match="you'll">
<re.Match object; span=(26, 30), match="i've">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(24, 30), match="you'll">
<re.Match object; span=(0, 7), match="haven't">
<re.Match object; span=(39, 43), match="i've">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(18, 22), match="i've">
<re.Match object; span=(14, 20), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(12, 17), match="don't">
<re.Match object; span=(58, 63), match="we're">
<re.Match object; span=(17, 20), match="i'm">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(40, 45), match="won't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(17, 23), match="hadn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(20, 26), match="you're">
<re.Match object; span=(64, 70), match="who've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(56, 61), match="isn't">
<re.Match object; span=(34, 41), match="weren't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(3, 9), match="you've">
<re.Match object; span=(20, 26), match="that's">
<re.Match object; span=(4, 11), match="doesn't">
<re.Match object; span=(23, 28), match="isn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(30, 35), match="don't">
<re.Match object; span=(35, 42), match="doesn't">
<re.Match object; span=(5, 10), match="who's">
<re.Match object; span=(48, 54), match="that's">
<re.Match object; span=(91, 95), match="i've">
<re.Match object; span=(20, 27), match="doesn't">
<re.Match object; span=(40, 45), match="won't">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(18, 23), match="don't">
<re.Match object; span=(14, 22), match="couldn't">
<re.Match object; span=(25, 29), match="i've">
<re.Match object; span=(40, 47), match="doesn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(15, 18), match="i'm">
<re.Match object; span=(9, 14), match="isn't">
<re.Match object; span=(41, 48), match="doesn't">
<re.Match object; span=(118, 123), match="don't">
<re.Match object; span=(91, 98), match="doesn't">
<re.Match object; span=(15, 21), match="that's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(22, 27), match="don't">
<re.Match object; span=(0, 5), match="you'd">
<re.Match object; span=(28, 36), match="wouldn't">
<re.Match object; span=(55, 59), match="i've">
<re.Match object; span=(16, 21), match="can't">
<re.Match object; span=(35, 38), match="i'm">
<re.Match object; span=(55, 60), match="don't">
<re.Match object; span=(19, 23), match="i'll">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(33, 37), match="i've">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(24, 29), match="won't">
<re.Match object; span=(12, 16), match="i've">
<re.Match object; span=(17, 22), match="don't">
<re.Match object; span=(29, 36), match="haven't">
<re.Match object; span=(145, 151), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(52, 57), match="won't">
<re.Match object; span=(27, 31), match="we'd">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(28, 33), match="don't">
<re.Match object; span=(37, 41), match="i've">
<re.Match object; span=(11, 17), match="didn't">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(24, 29), match="he'll">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(12, 16), match="i've">
<re.Match object; span=(24, 29), match="don't">
<re.Match object; span=(22, 28), match="didn't">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(29, 34), match="can't">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(26, 32), match="you're">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="who's">
<re.Match object; span=(46, 50), match="he's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(6, 12), match="that's">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(18, 22), match="i've">
<re.Match object; span=(38, 41), match="i'm">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(18, 22), match="i've">
<re.Match object; span=(24, 29), match="don't">
<re.Match object; span=(24, 30), match="you've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="you'd">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(36, 42), match="didn't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(11, 17), match="you've">
<re.Match object; span=(10, 17), match="doesn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(14, 18), match="i'll">
<re.Match object; span=(35, 40), match="can't">
<re.Match object; span=(0, 5), match="who's">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(35, 42), match="doesn't">
<re.Match object; span=(20, 27), match="doesn't">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(28, 31), match="i'd">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(30, 35), match="can't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(23, 26), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(21, 27), match="you've">
<re.Match object; span=(41, 45), match="i've">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(56, 61), match="don't">
<re.Match object; span=(5, 10), match="won't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(69, 75), match="you've">
<re.Match object; span=(10, 14), match="i've">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(37, 41), match="i've">
<re.Match object; span=(29, 34), match="who's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(22, 29), match="doesn't">
<re.Match object; span=(16, 22), match="you'll">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(14, 20), match="didn't">
<re.Match object; span=(34, 38), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(26, 33), match="doesn't">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(30, 36), match="you've">
<re.Match object; span=(34, 37), match="i'm">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(20, 25), match="don't">
<re.Match object; span=(33, 40), match="they're">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(30, 35), match="don't">
<re.Match object; span=(85, 92), match="they're">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(17, 23), match="you're">
<re.Match object; span=(2, 10), match="couldn't">
<re.Match object; span=(8, 15), match="they're">
<re.Match object; span=(34, 41), match="they're">
<re.Match object; span=(6, 9), match="i'm">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(61, 65), match="i've">
<re.Match object; span=(18, 23), match="don't">
<re.Match object; span=(41, 44), match="i'm">
<re.Match object; span=(23, 29), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(28, 34), match="didn't">
<re.Match object; span=(45, 49), match="i've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(5, 12), match="there's">
<re.Match object; span=(12, 17), match="don't">
<re.Match object; span=(14, 20), match="that's">
<re.Match object; span=(10, 17), match="doesn't">
<re.Match object; span=(17, 21), match="i've">
<re.Match object; span=(5, 11), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(28, 32), match="i've">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(10, 17), match="doesn't">
<re.Match object; span=(15, 20), match="don't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(87, 92), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(40, 46), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(23, 30), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(18, 24), match="aren't">
<re.Match object; span=(21, 26), match="don't">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(51, 55), match="i've">
<re.Match object; span=(28, 35), match="doesn't">
<re.Match object; span=(20, 27), match="doesn't">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(13, 19), match="what's">
<re.Match object; span=(17, 22), match="who's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(0, 5), match="who's">
<re.Match object; span=(26, 32), match="you've">
<re.Match object; span=(0, 7), match="they're">
<re.Match object; span=(18, 25), match="they're">
<re.Match object; span=(23, 28), match="don't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(40, 45), match="don't">
<re.Match object; span=(58, 63), match="don't">
<re.Match object; span=(24, 30), match="that's">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(30, 35), match="can't">
<re.Match object; span=(25, 29), match="i've">
<re.Match object; span=(14, 19), match="can't">
<re.Match object; span=(13, 17), match="i've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(13, 18), match="don't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(30, 34), match="i'll">
<re.Match object; span=(15, 21), match="that's">
<re.Match object; span=(27, 32), match="can't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 5), match="isn't">
<re.Match object; span=(42, 47), match="don't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(19, 25), match="didn't">
<re.Match object; span=(29, 35), match="you'll">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(20, 25), match="don't">
<re.Match object; span=(8, 14), match="that's">
<re.Match object; span=(52, 56), match="i'll">
<re.Match object; span=(16, 19), match="i'm">
<re.Match object; span=(24, 29), match="won't">
<re.Match object; span=(8, 13), match="isn't">
<re.Match object; span=(13, 19), match="you're">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(20, 24), match="i've">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(34, 38), match="i've">
<re.Match object; span=(7, 13), match="didn't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(31, 34), match="i'd">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(21, 25), match="i've">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(16, 21), match="won't">
<re.Match object; span=(40, 45), match="don't">
<re.Match object; span=(14, 17), match="i'm">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(47, 52), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(24, 30), match="you'll">
<re.Match object; span=(35, 40), match="don't">
<re.Match object; span=(60, 63), match="i'm">
<re.Match object; span=(33, 39), match="hadn't">
<re.Match object; span=(34, 40), match="you've">
<re.Match object; span=(24, 31), match="where's">
<re.Match object; span=(6, 14), match="couldn't">
<re.Match object; span=(5, 12), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(35, 41), match="you're">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(34, 40), match="didn't">
<re.Match object; span=(36, 42), match="didn't">
<re.Match object; span=(1, 7), match="didn't">
<re.Match object; span=(31, 38), match="doesn't">
<re.Match object; span=(18, 25), match="doesn't">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(31, 36), match="we've">
<re.Match object; span=(51, 56), match="we've">
<re.Match object; span=(57, 62), match="don't">
<re.Match object; span=(3, 8), match="isn't">
<re.Match object; span=(6, 11), match="isn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(43, 48), match="isn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(23, 29), match="that's">
<re.Match object; span=(0, 5), match="who's">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(42, 49), match="doesn't">
<re.Match object; span=(13, 18), match="don't">
<re.Match object; span=(23, 28), match="can't">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(63, 68), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(27, 33), match="what's">
<re.Match object; span=(9, 12), match="i'm">
<re.Match object; span=(25, 31), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(27, 31), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(32, 36), match="i've">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(15, 21), match="you're">
<re.Match object; span=(28, 33), match="don't">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(46, 53), match="weren't">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(18, 25), match="doesn't">
<re.Match object; span=(44, 47), match="i'd">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(15, 19), match="i'll">
<re.Match object; span=(23, 27), match="i'll">
<re.Match object; span=(14, 19), match="can't">
<re.Match object; span=(13, 19), match="you'll">
<re.Match object; span=(87, 93), match="you're">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(31, 34), match="i'd">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(12, 17), match="isn't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(22, 28), match="that's">
<re.Match object; span=(74, 80), match="that's">
<re.Match object; span=(55, 59), match="i've">
<re.Match object; span=(8, 15), match="doesn't">
<re.Match object; span=(61, 66), match="don't">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(22, 31), match="shouldn't">
<re.Match object; span=(31, 35), match="i've">
<re.Match object; span=(48, 54), match="you'll">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(4, 7), match="i'm">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(34, 38), match="i've">
<re.Match object; span=(18, 24), match="that's">
<re.Match object; span=(57, 63), match="that's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(7, 10), match="i'm">
<re.Match object; span=(65, 71), match="you've">
<re.Match object; span=(38, 42), match="he's">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(32, 36), match="i've">
<re.Match object; span=(20, 25), match="don't">
<re.Match object; span=(4, 10), match="didn't">
<re.Match object; span=(19, 25), match="didn't">
<re.Match object; span=(6, 14), match="couldn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(52, 57), match="isn't">
<re.Match object; span=(7, 13), match="that's">
<re.Match object; span=(4, 10), match="that's">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(30, 35), match="don't">
<re.Match object; span=(33, 39), match="that's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(18, 24), match="didn't">
<re.Match object; span=(30, 35), match="don't">
<re.Match object; span=(16, 23), match="doesn't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(4, 10), match="didn't">
<re.Match object; span=(12, 16), match="i've">
<re.Match object; span=(30, 34), match="i've">
<re.Match object; span=(14, 20), match="you're">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="we're">
<re.Match object; span=(5, 11), match="you've">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(11, 16), match="can't">
<re.Match object; span=(51, 54), match="i'd">
<re.Match object; span=(41, 48), match="doesn't">
<re.Match object; span=(40, 45), match="can't">
<re.Match object; span=(57, 63), match="that's">
<re.Match object; span=(70, 76), match="what's">
<re.Match object; span=(33, 40), match="where's">
<re.Match object; span=(121, 127), match="you'll">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(27, 32), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(43, 51), match="wouldn't">
<re.Match object; span=(23, 32), match="shouldn't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(39, 46), match="doesn't">
<re.Match object; span=(0, 5), match="we've">
<re.Match object; span=(38, 44), match="that's">
<re.Match object; span=(24, 29), match="don't">
<re.Match object; span=(23, 29), match="that's">
<re.Match object; span=(24, 29), match="don't">
<re.Match object; span=(21, 29), match="wouldn't">
<re.Match object; span=(79, 84), match="you'd">
<re.Match object; span=(12, 18), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(37, 40), match="i'm">
<re.Match object; span=(16, 21), match="don't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(34, 37), match="i'm">
<re.Match object; span=(96, 101), match="can't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(86, 91), match="can't">
<re.Match object; span=(49, 55), match="didn't">
<re.Match object; span=(52, 57), match="won't">
<re.Match object; span=(50, 57), match="where's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(20, 25), match="can't">
<re.Match object; span=(34, 40), match="you're">
<re.Match object; span=(125, 130), match="let's">
<re.Match object; span=(40, 48), match="couldn't">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(18, 23), match="let's">
<re.Match object; span=(10, 15), match="don't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(44, 49), match="let's">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(28, 34), match="you've">
<re.Match object; span=(19, 24), match="can't">
<re.Match object; span=(52, 56), match="i've">
<re.Match object; span=(10, 15), match="let's">
<re.Match object; span=(4, 10), match="didn't">
<re.Match object; span=(30, 36), match="that's">
<re.Match object; span=(3, 8), match="isn't">
<re.Match object; span=(39, 44), match="can't">
<re.Match object; span=(29, 34), match="don't">
<re.Match object; span=(7, 14), match="doesn't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(23, 28), match="let's">
<re.Match object; span=(37, 42), match="she's">
<re.Match object; span=(26, 31), match="can't">
<re.Match object; span=(103, 107), match="he's">
<re.Match object; span=(9, 14), match="let's">
<re.Match object; span=(8, 13), match="let's">
<re.Match object; span=(13, 20), match="doesn't">
<re.Match object; span=(36, 43), match="doesn't">
<re.Match object; span=(77, 81), match="i've">
<re.Match object; span=(50, 55), match="can't">
<re.Match object; span=(11, 18), match="doesn't">
<re.Match object; span=(45, 51), match="you're">
<re.Match object; span=(73, 78), match="won't">
<re.Match object; span=(7, 10), match="i'd">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(19, 24), match="don't">
<re.Match object; span=(12, 18), match="that's">
<re.Match object; span=(30, 36), match="didn't">
<re.Match object; span=(45, 49), match="i've">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(40, 46), match="didn't">
<re.Match object; span=(40, 46), match="that's">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(6, 10), match="i've">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(10, 17), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(19, 25), match="that's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(37, 41), match="i've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(9, 14), match="don't">
<re.Match object; span=(30, 36), match="you're">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(29, 35), match="you'll">
<re.Match object; span=(22, 27), match="don't">
<re.Match object; span=(12, 18), match="you're">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(37, 43), match="didn't">
<re.Match object; span=(22, 29), match="they'll">
<re.Match object; span=(54, 59), match="isn't">
<re.Match object; span=(9, 14), match="won't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(13, 19), match="didn't">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(42, 47), match="don't">
<re.Match object; span=(7, 13), match="hasn't">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(33, 39), match="you're">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(29, 36), match="haven't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(18, 24), match="didn't">
<re.Match object; span=(13, 20), match="haven't">
<re.Match object; span=(5, 10), match="won't">
<re.Match object; span=(31, 36), match="don't">
<re.Match object; span=(5, 14), match="shouldn't">
<re.Match object; span=(11, 16), match="isn't">
<re.Match object; span=(41, 46), match="isn't">
<re.Match object; span=(3, 9), match="didn't">
<re.Match object; span=(20, 25), match="don't">
<re.Match object; span=(17, 22), match="can't">
<re.Match object; span=(17, 22), match="don't">
<re.Match object; span=(40, 47), match="haven't">
<re.Match object; span=(66, 71), match="don't">
<re.Match object; span=(47, 52), match="don't">
<re.Match object; span=(8, 11), match="i'm">
<re.Match object; span=(44, 47), match="i'd">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(11, 14), match="i'm">
<re.Match object; span=(11, 17), match="didn't">
<re.Match object; span=(25, 32), match="there's">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(37, 43), match="didn't">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(34, 37), match="i'd">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(63, 68), match="can't">
<re.Match object; span=(89, 93), match="he's">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(35, 40), match="don't">
<re.Match object; span=(41, 47), match="you've">
<re.Match object; span=(42, 49), match="doesn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 6), match="that's">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(19, 26), match="there's">
<re.Match object; span=(16, 21), match="can't">
<re.Match object; span=(41, 46), match="isn't">
<re.Match object; span=(7, 13), match="didn't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(47, 54), match="there's">
<re.Match object; span=(52, 57), match="don't">
<re.Match object; span=(44, 49), match="you'd">
<re.Match object; span=(60, 65), match="isn't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(20, 26), match="that's">
<re.Match object; span=(20, 26), match="that's">
<re.Match object; span=(40, 45), match="don't">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(35, 43), match="couldn't">
<re.Match object; span=(27, 32), match="won't">
<re.Match object; span=(21, 26), match="don't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(35, 42), match="there's">
<re.Match object; span=(9, 13), match="i've">
<re.Match object; span=(29, 34), match="don't">
<re.Match object; span=(69, 76), match="they've">
<re.Match object; span=(41, 46), match="don't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(53, 58), match="don't">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(34, 38), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(24, 29), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(25, 32), match="haven't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(51, 56), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(47, 51), match="i've">
<re.Match object; span=(19, 24), match="can't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(14, 19), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(26, 31), match="who's">
<re.Match object; span=(12, 17), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="that's">
<re.Match object; span=(10, 15), match="don't">
<re.Match object; span=(43, 49), match="what's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(28, 33), match="don't">
<re.Match object; span=(13, 18), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(10, 15), match="isn't">
<re.Match object; span=(20, 28), match="wouldn't">
<re.Match object; span=(26, 31), match="don't">
<re.Match object; span=(26, 29), match="i'm">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(20, 25), match="can't">
<re.Match object; span=(11, 16), match="can't">
<re.Match object; span=(4, 13), match="shouldn't">
<re.Match object; span=(34, 43), match="shouldn't">
<re.Match object; span=(0, 6), match="you've">
<re.Match object; span=(17, 23), match="you're">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(24, 30), match="that's">
<re.Match object; span=(28, 33), match="don't">
<re.Match object; span=(32, 35), match="i'm">
<re.Match object; span=(24, 29), match="don't">
<re.Match object; span=(7, 13), match="that's">
<re.Match object; span=(30, 36), match="you'll">
<re.Match object; span=(13, 19), match="you're">
<re.Match object; span=(0, 5), match="isn't">
<re.Match object; span=(34, 41), match="doesn't">
<re.Match object; span=(14, 20), match="you've">
<re.Match object; span=(16, 22), match="you're">
<re.Match object; span=(5, 10), match="can't">
<re.Match object; span=(26, 31), match="don't">
<re.Match object; span=(12, 16), match="i've">
<re.Match object; span=(23, 28), match="isn't">
<re.Match object; span=(0, 5), match="who's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(16, 22), match="didn't">
<re.Match object; span=(6, 9), match="i'm">
<re.Match object; span=(28, 34), match="didn't">
<re.Match object; span=(5, 12), match="doesn't">
<re.Match object; span=(16, 21), match="don't">
<re.Match object; span=(30, 35), match="don't">
<re.Match object; span=(2, 10), match="couldn't">
<re.Match object; span=(37, 41), match="i've">
<re.Match object; span=(18, 22), match="i've">
<re.Match object; span=(25, 33), match="couldn't">
<re.Match object; span=(38, 44), match="you'll">
<re.Match object; span=(28, 33), match="who's">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(6, 12), match="didn't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(23, 30), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(50, 57), match="they're">
<re.Match object; span=(12, 16), match="i've">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(28, 34), match="you'll">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(7, 14), match="haven't">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(16, 21), match="can't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(13, 21), match="couldn't">
<re.Match object; span=(17, 21), match="i've">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(12, 18), match="aren't">
<re.Match object; span=(17, 23), match="you've">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(38, 43), match="can't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(16, 22), match="that's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(36, 40), match="i've">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(33, 40), match="there's">
<re.Match object; span=(136, 139), match="i'm">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 6), match="that's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(74, 79), match="don't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(29, 36), match="there's">
<re.Match object; span=(4, 11), match="haven't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(30, 36), match="that's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(19, 22), match="i'm">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(33, 39), match="you'll">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(25, 32), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(12, 19), match="there's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(35, 39), match="i've">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(27, 31), match="i've">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(27, 31), match="i've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(47, 53), match="aren't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(23, 29), match="you're">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(17, 24), match="there's">
<re.Match object; span=(36, 42), match="didn't">
<re.Match object; span=(29, 36), match="there's">
<re.Match object; span=(26, 33), match="there's">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(51, 55), match="i've">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(5, 11), match="didn't">
<re.Match object; span=(0, 6), match="that's">
<re.Match object; span=(19, 27), match="couldn't">
<re.Match object; span=(37, 43), match="you'll">
<re.Match object; span=(87, 93), match="you'll">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(22, 27), match="we've">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(30, 37), match="there's">
<re.Match object; span=(46, 53), match="they're">
<re.Match object; span=(0, 5), match="we've">
<re.Match object; span=(19, 24), match="we've">
<re.Match object; span=(57, 63), match="that's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(9, 16), match="doesn't">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(17, 22), match="don't">
<re.Match object; span=(0, 8), match="wouldn't">
<re.Match object; span=(24, 29), match="don't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(29, 35), match="you're">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(27, 33), match="you'll">
<re.Match object; span=(9, 13), match="i've">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(16, 23), match="they've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(9, 14), match="can't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(20, 26), match="you'll">
<re.Match object; span=(52, 57), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(39, 44), match="don't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(12, 16), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(47, 52), match="don't">
<re.Match object; span=(13, 18), match="don't">
<re.Match object; span=(25, 29), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(57, 61), match="i've">
<re.Match object; span=(27, 32), match="won't">
<re.Match object; span=(7, 13), match="didn't">
<re.Match object; span=(23, 28), match="can't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(9, 12), match="i'm">
<re.Match object; span=(11, 17), match="aren't">
<re.Match object; span=(117, 122), match="can't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(11, 18), match="there's">
<re.Match object; span=(13, 17), match="i've">
<re.Match object; span=(26, 31), match="don't">
<re.Match object; span=(84, 90), match="you'll">
<re.Match object; span=(21, 27), match="that's">
<re.Match object; span=(10, 16), match="that's">
<re.Match object; span=(8, 11), match="i'm">
<re.Match object; span=(36, 45), match="shouldn't">
<re.Match object; span=(26, 30), match="i've">
<re.Match object; span=(21, 27), match="you're">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(23, 30), match="doesn't">
<re.Match object; span=(51, 57), match="didn't">
<re.Match object; span=(12, 21), match="shouldn't">
<re.Match object; span=(19, 23), match="i'll">
<re.Match object; span=(15, 21), match="what's">
<re.Match object; span=(42, 46), match="i've">
<re.Match object; span=(57, 64), match="doesn't">
<re.Match object; span=(12, 17), match="don't">
<re.Match object; span=(95, 103), match="wouldn't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(126, 132), match="didn't">
<re.Match object; span=(43, 49), match="you'll">
<re.Match object; span=(65, 71), match="you'll">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(19, 25), match="you're">
<re.Match object; span=(15, 21), match="what's">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(13, 16), match="i'm">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(28, 34), match="you'll">
<re.Match object; span=(64, 70), match="didn't">
<re.Match object; span=(20, 26), match="didn't">
<re.Match object; span=(23, 30), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(5, 10), match="isn't">
<re.Match object; span=(8, 14), match="that's">
<re.Match object; span=(13, 19), match="didn't">
<re.Match object; span=(17, 21), match="i've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(15, 21), match="didn't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(5, 13), match="wouldn't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(47, 53), match="that's">
<re.Match object; span=(16, 21), match="don't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(27, 33), match="that's">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(26, 32), match="what's">
<re.Match object; span=(16, 21), match="can't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(23, 30), match="doesn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(31, 37), match="didn't">
<re.Match object; span=(32, 35), match="i'm">
<re.Match object; span=(41, 45), match="i've">
<re.Match object; span=(42, 45), match="i'm">
<re.Match object; span=(64, 67), match="i'm">
<re.Match object; span=(90, 93), match="i'm">
<re.Match object; span=(37, 44), match="what'll">
<re.Match object; span=(36, 41), match="don't">
<re.Match object; span=(4, 9), match="won't">
<re.Match object; span=(57, 66), match="shouldn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(24, 29), match="don't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(28, 31), match="i'm">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(18, 25), match="there's">
<re.Match object; span=(41, 48), match="there's">
<re.Match object; span=(44, 49), match="isn't">
<re.Match object; span=(52, 59), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(25, 28), match="i'm">
<re.Match object; span=(19, 26), match="haven't">
<re.Match object; span=(6, 11), match="isn't">
<re.Match object; span=(17, 22), match="don't">
<re.Match object; span=(11, 17), match="didn't">
<re.Match object; span=(15, 18), match="i'm">
<re.Match object; span=(4, 11), match="doesn't">
<re.Match object; span=(52, 57), match="you'd">
<re.Match object; span=(9, 14), match="isn't">
<re.Match object; span=(9, 14), match="won't">
<re.Match object; span=(28, 36), match="couldn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(27, 31), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(66, 71), match="isn't">
<re.Match object; span=(53, 59), match="didn't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(0, 5), match="we've">
<re.Match object; span=(19, 24), match="we've">
<re.Match object; span=(31, 36), match="we've">
<re.Match object; span=(6, 11), match="don't">
<re.Match object; span=(18, 23), match="can't">
<re.Match object; span=(11, 17), match="didn't">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(48, 54), match="didn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(4, 10), match="that's">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(5, 12), match="there's">
<re.Match object; span=(45, 50), match="don't">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(42, 49), match="doesn't">
<re.Match object; span=(0, 5), match="who's">
<re.Match object; span=(17, 23), match="you're">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(37, 43), match="didn't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(58, 63), match="isn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 5), match="won't">
<re.Match object; span=(29, 33), match="i've">
<re.Match object; span=(27, 32), match="don't">
<re.Match object; span=(32, 35), match="i'd">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(51, 57), match="you've">
<re.Match object; span=(8, 14), match="you're">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(25, 31), match="you'll">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(38, 41), match="i'm">
<re.Match object; span=(54, 58), match="i've">
<re.Match object; span=(31, 35), match="i've">
<re.Match object; span=(16, 23), match="doesn't">
<re.Match object; span=(22, 27), match="isn't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(35, 41), match="aren't">
<re.Match object; span=(36, 42), match="you'll">
<re.Match object; span=(24, 30), match="you'll">
<re.Match object; span=(35, 39), match="i've">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(31, 37), match="that's">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(19, 25), match="that's">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(33, 37), match="i've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(2, 10), match="couldn't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(21, 27), match="that's">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(89, 96), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(8, 13), match="won't">
<re.Match object; span=(11, 14), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(16, 22), match="you're">
<re.Match object; span=(22, 28), match="you've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(11, 16), match="don't">
<re.Match object; span=(14, 17), match="i'd">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(11, 16), match="don't">
<re.Match object; span=(15, 20), match="don't">
<re.Match object; span=(30, 36), match="that's">
<re.Match object; span=(14, 21), match="doesn't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="we're">
<re.Match object; span=(14, 21), match="doesn't">
<re.Match object; span=(30, 35), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(20, 27), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(40, 45), match="won't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(5, 12), match="doesn't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(23, 30), match="doesn't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(14, 20), match="that's">
<re.Match object; span=(0, 6), match="you've">
<re.Match object; span=(23, 29), match="that's">
<re.Match object; span=(7, 13), match="didn't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(12, 16), match="i've">
<re.Match object; span=(14, 20), match="didn't">
<re.Match object; span=(11, 17), match="you're">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(10, 17), match="doesn't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(74, 80), match="you're">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(114, 118), match="i've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(12, 19), match="doesn't">
<re.Match object; span=(38, 43), match="isn't">
<re.Match object; span=(35, 40), match="won't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(43, 49), match="didn't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(49, 53), match="i've">
<re.Match object; span=(2, 10), match="wouldn't">
<re.Match object; span=(0, 6), match="you've">
<re.Match object; span=(33, 38), match="can't">
<re.Match object; span=(0, 6), match="you've">
<re.Match object; span=(33, 38), match="can't">
<re.Match object; span=(9, 14), match="don't">
<re.Match object; span=(58, 65), match="doesn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(55, 62), match="doesn't">
<re.Match object; span=(15, 20), match="we're">
<re.Match object; span=(40, 45), match="isn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(28, 34), match="that's">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(24, 27), match="i'd">
<re.Match object; span=(5, 13), match="couldn't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(20, 25), match="isn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(14, 19), match="isn't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(46, 51), match="won't">
<re.Match object; span=(20, 27), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(17, 20), match="i'm">
<re.Match object; span=(23, 29), match="didn't">
<re.Match object; span=(19, 27), match="couldn't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(30, 35), match="don't">
<re.Match object; span=(8, 13), match="can't">
<re.Match object; span=(57, 63), match="you'll">
<re.Match object; span=(77, 83), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(17, 20), match="i'm">
<re.Match object; span=(16, 21), match="don't">
<re.Match object; span=(30, 33), match="i'm">
<re.Match object; span=(21, 25), match="i'll">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(25, 30), match="don't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(16, 22), match="you'll">
<re.Match object; span=(9, 14), match="don't">
<re.Match object; span=(12, 16), match="i've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(42, 49), match="doesn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(20, 23), match="i'd">
<re.Match object; span=(21, 28), match="doesn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(32, 38), match="that's">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(34, 39), match="don't">
<re.Match object; span=(50, 57), match="doesn't">
<re.Match object; span=(4, 11), match="haven't">
<re.Match object; span=(27, 33), match="you've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(14, 20), match="didn't">
<re.Match object; span=(8, 13), match="don't">
<re.Match object; span=(11, 16), match="don't">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(26, 31), match="don't">
<re.Match object; span=(22, 27), match="don't">
<re.Match object; span=(36, 39), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(10, 15), match="can't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(5, 10), match="isn't">
<re.Match object; span=(46, 51), match="can't">
<re.Match object; span=(39, 45), match="that's">
<re.Match object; span=(16, 22), match="you're">
<re.Match object; span=(22, 26), match="he's">
<re.Match object; span=(27, 33), match="that's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(18, 25), match="there's">
<re.Match object; span=(89, 93), match="he's">
<re.Match object; span=(3, 8), match="we're">
<re.Match object; span=(29, 34), match="don't">
<re.Match object; span=(27, 32), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(3, 9), match="you've">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(21, 27), match="you'll">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(47, 51), match="i've">
<re.Match object; span=(26, 31), match="can't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(62, 66), match="i've">
<re.Match object; span=(27, 32), match="don't">
<re.Match object; span=(3, 9), match="you've">
<re.Match object; span=(6, 11), match="don't">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(67, 72), match="don't">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(33, 39), match="you'll">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(24, 28), match="he's">
<re.Match object; span=(43, 48), match="she's">
<re.Match object; span=(110, 117), match="they're">
<re.Match object; span=(9, 12), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(26, 30), match="i've">
<re.Match object; span=(51, 56), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(2, 7), match="won't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(37, 44), match="doesn't">
<re.Match object; span=(3, 7), match="he's">
<re.Match object; span=(22, 26), match="he's">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(21, 27), match="what's">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(5, 8), match="i'm">
<re.Match object; span=(55, 62), match="they're">
<re.Match object; span=(37, 42), match="can't">
<re.Match object; span=(19, 23), match="he's">
<re.Match object; span=(30, 36), match="that's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(8, 13), match="she's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(4, 11), match="doesn't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(20, 26), match="that's">
<re.Match object; span=(45, 50), match="don't">
<re.Match object; span=(20, 24), match="i've">
<re.Match object; span=(8, 15), match="where's">
<re.Match object; span=(38, 43), match="don't">
<re.Match object; span=(11, 14), match="i'd">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(30, 35), match="don't">
<re.Match object; span=(14, 20), match="that's">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(56, 61), match="don't">
<re.Match object; span=(15, 20), match="don't">
<re.Match object; span=(0, 7), match="they've">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(9, 17), match="couldn't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(34, 41), match="there's">
<re.Match object; span=(26, 31), match="don't">
<re.Match object; span=(20, 25), match="we're">
<re.Match object; span=(38, 45), match="doesn't">
<re.Match object; span=(53, 59), match="didn't">
<re.Match object; span=(37, 43), match="didn't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(73, 76), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(5, 12), match="weren't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(5, 8), match="i'm">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(34, 41), match="doesn't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(38, 44), match="aren't">
<re.Match object; span=(16, 19), match="i'm">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(15, 20), match="isn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(21, 26), match="don't">
<re.Match object; span=(17, 23), match="didn't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(43, 50), match="doesn't">
<re.Match object; span=(77, 82), match="isn't">
<re.Match object; span=(9, 15), match="didn't">
<re.Match object; span=(37, 40), match="i'd">
<re.Match object; span=(36, 43), match="doesn't">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(37, 42), match="can't">
<re.Match object; span=(28, 31), match="i'm">
<re.Match object; span=(148, 152), match="i'll">
<re.Match object; span=(25, 30), match="don't">
<re.Match object; span=(22, 29), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(30, 37), match="doesn't">
<re.Match object; span=(14, 19), match="don't">
<re.Match object; span=(22, 27), match="don't">
<re.Match object; span=(8, 13), match="we're">
<re.Match object; span=(20, 27), match="doesn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(26, 33), match="doesn't">
<re.Match object; span=(35, 39), match="i'll">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(10, 17), match="doesn't">
<re.Match object; span=(7, 13), match="aren't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(42, 45), match="i'd">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(19, 24), match="don't">
<re.Match object; span=(51, 58), match="doesn't">
<re.Match object; span=(17, 20), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(13, 19), match="you've">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(7, 14), match="there's">
<re.Match object; span=(22, 27), match="won't">
<re.Match object; span=(7, 14), match="doesn't">
<re.Match object; span=(13, 18), match="don't">
<re.Match object; span=(23, 31), match="couldn't">
<re.Match object; span=(8, 16), match="couldn't">
<re.Match object; span=(23, 29), match="you'll">
<re.Match object; span=(55, 60), match="don't">
<re.Match object; span=(16, 21), match="don't">
<re.Match object; span=(28, 33), match="can't">
<re.Match object; span=(6, 11), match="isn't">
<re.Match object; span=(46, 53), match="there's">
<re.Match object; span=(11, 17), match="didn't">
<re.Match object; span=(11, 16), match="can't">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(54, 59), match="won't">
<re.Match object; span=(60, 65), match="don't">
<re.Match object; span=(6, 12), match="didn't">
<re.Match object; span=(51, 57), match="didn't">
<re.Match object; span=(98, 104), match="hasn't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(43, 49), match="hadn't">
<re.Match object; span=(16, 19), match="i'd">
<re.Match object; span=(29, 35), match="didn't">
<re.Match object; span=(16, 22), match="that's">
<re.Match object; span=(30, 36), match="didn't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(15, 18), match="i'm">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(20, 27), match="there's">
<re.Match object; span=(38, 44), match="that's">
<re.Match object; span=(18, 23), match="isn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(13, 20), match="doesn't">
<re.Match object; span=(26, 35), match="shouldn't">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(30, 34), match="i've">
<re.Match object; span=(28, 33), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(60, 66), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(31, 37), match="that's">
<re.Match object; span=(12, 18), match="hadn't">
<re.Match object; span=(11, 14), match="i'd">
<re.Match object; span=(29, 34), match="don't">
<re.Match object; span=(0, 5), match="you'd">
<re.Match object; span=(8, 14), match="hadn't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(18, 23), match="don't">
<re.Match object; span=(15, 20), match="don't">
<re.Match object; span=(48, 53), match="isn't">
<re.Match object; span=(5, 10), match="isn't">
<re.Match object; span=(0, 6), match="they'd">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(27, 32), match="don't">
<re.Match object; span=(11, 16), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(4, 8), match="i've">
<re.Match object; span=(21, 27), match="you're">
<re.Match object; span=(54, 61), match="doesn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(29, 34), match="can't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(16, 21), match="can't">
<re.Match object; span=(25, 30), match="don't">
<re.Match object; span=(13, 20), match="doesn't">
<re.Match object; span=(16, 23), match="doesn't">
<re.Match object; span=(22, 27), match="don't">
<re.Match object; span=(34, 41), match="doesn't">
<re.Match object; span=(6, 11), match="we're">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(29, 33), match="i've">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(12, 16), match="i've">
<re.Match object; span=(6, 12), match="didn't">
<re.Match object; span=(48, 54), match="hasn't">
<re.Match object; span=(10, 16), match="you're">
<re.Match object; span=(31, 35), match="i've">
<re.Match object; span=(0, 9), match="shouldn't">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(10, 15), match="let's">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(20, 25), match="don't">
<re.Match object; span=(33, 38), match="don't">
<re.Match object; span=(4, 7), match="i'm">
<re.Match object; span=(23, 28), match="isn't">
<re.Match object; span=(17, 23), match="that's">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(36, 41), match="isn't">
<re.Match object; span=(21, 26), match="don't">
<re.Match object; span=(42, 47), match="can't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(10, 15), match="let's">
<re.Match object; span=(26, 31), match="isn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(36, 41), match="don't">
<re.Match object; span=(0, 5), match="who's">
<re.Match object; span=(26, 34), match="couldn't">
<re.Match object; span=(29, 35), match="that's">
<re.Match object; span=(15, 20), match="don't">
<re.Match object; span=(49, 56), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(70, 77), match="doesn't">
<re.Match object; span=(10, 16), match="that's">
<re.Match object; span=(3, 9), match="didn't">
<re.Match object; span=(55, 59), match="i've">
<re.Match object; span=(25, 29), match="i've">
<re.Match object; span=(13, 18), match="don't">
<re.Match object; span=(2, 10), match="couldn't">
<re.Match object; span=(6, 13), match="where's">
<re.Match object; span=(9, 14), match="isn't">
<re.Match object; span=(6, 13), match="where's">
<re.Match object; span=(20, 26), match="you'll">
<re.Match object; span=(12, 18), match="didn't">
<re.Match object; span=(56, 62), match="didn't">
<re.Match object; span=(69, 72), match="i'd">
<re.Match object; span=(18, 23), match="don't">
<re.Match object; span=(23, 27), match="he's">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(15, 21), match="what's">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(30, 34), match="i've">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(8, 15), match="there's">
<re.Match object; span=(31, 35), match="i've">
<re.Match object; span=(17, 25), match="couldn't">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(6, 13), match="where's">
<re.Match object; span=(12, 15), match="i'm">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(20, 24), match="i've">
<re.Match object; span=(6, 13), match="where's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(16, 21), match="can't">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(6, 13), match="where's">
<re.Match object; span=(9, 14), match="you'd">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(13, 18), match="don't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(24, 31), match="there's">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(40, 45), match="don't">
<re.Match object; span=(18, 24), match="you're">
<re.Match object; span=(20, 25), match="don't">
<re.Match object; span=(15, 20), match="can't">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(43, 47), match="i've">
<re.Match object; span=(19, 24), match="can't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(28, 34), match="you'll">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(0, 5), match="we're">
<re.Match object; span=(0, 5), match="we're">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(19, 25), match="that's">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(12, 16), match="he's">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(32, 37), match="can't">
<re.Match object; span=(1, 6), match="we're">
<re.Match object; span=(5, 11), match="didn't">
<re.Match object; span=(80, 86), match="that's">
<re.Match object; span=(72, 77), match="isn't">
<re.Match object; span=(0, 5), match="we're">
<re.Match object; span=(20, 27), match="doesn't">
<re.Match object; span=(72, 78), match="didn't">
<re.Match object; span=(37, 42), match="don't">
<re.Match object; span=(4, 11), match="where's">
<re.Match object; span=(2, 7), match="won't">
<re.Match object; span=(24, 30), match="you've">
<re.Match object; span=(64, 70), match="that's">
<re.Match object; span=(33, 37), match="i've">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(128, 132), match="he's">
<re.Match object; span=(16, 23), match="doesn't">
<re.Match object; span=(47, 52), match="don't">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(7, 12), match="let's">
<re.Match object; span=(71, 77), match="didn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(18, 24), match="didn't">
<re.Match object; span=(23, 28), match="can't">
<re.Match object; span=(45, 51), match="you've">
<re.Match object; span=(30, 35), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(59, 66), match="doesn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(23, 30), match="doesn't">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(8, 14), match="that's">
<re.Match object; span=(44, 52), match="couldn't">
<re.Match object; span=(36, 42), match="didn't">
<re.Match object; span=(31, 34), match="i'm">
<re.Match object; span=(63, 70), match="doesn't">
<re.Match object; span=(50, 56), match="didn't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(31, 37), match="didn't">
<re.Match object; span=(35, 40), match="can't">
<re.Match object; span=(0, 3), match="i'm">
In [ ]:
expand_contractions("doesn't react at all", contractions)
Out[ ]:
'does not react at all'
In [ ]:
movie_reviews.head(1)['plot_synopsis'].item()
Out[ ]:
"jack ryan ford is on a working vacation in london with his family. he has retired from the cia and is a professor at the us naval academy. he is seen delivering a lecture at the royal naval academy in london.meanwhile, ryan's wife cathy and daughter sally are sightseeing near buckingham palace. sally and cathy come upon a british royal guard, and sally tries to get the guard to react by doing an improvised tap dance in front of him. she is impressed when the guard, trained to ignore distraction, does not react at all, and they leave.as sally and cathy walk away from the guard, en route to rendezvous with ryan, they walk by a stolen cab, in which sit three ulster liberation army terrorists kevin o'donnell, the driver, as well as sean miller sean bean and his younger brother patrick. the three are loading bullets into their guns as they prepare to carry out a scheduled ambush on lord william holmes, british secretary of state for northern ireland and a distant member of the british royal family the cousin of the queen mother. as they finish loading their weapons, another terrorist, annette, radios to them that lord holmes is leaving the palace with his wife and son.as lord holmes's car leaves the palace, the terrorists' car falls in line behind him. they follow the car, headed for the ambush point. as they drive, o'donnell and the miller brothers don ski masks.meanwhile, ryan is about to cross the street to meet with his wife and daughter. he arrives at the same time that lord holmes's car is passing through the area.suddenly, another cab, parked on the side of the road, driven by annette, pulls out and swerves in front of lord holmes's vehicle. simultaneously, o'donnell's vehicle swerves to a stop behind the car. the millers and another accomplice jump out and quickly plant a bomb underneath the chauffeured car's engine. ryan sees what is going on and hastily gets his family to cover just as the bomb is detonated. moments later, sean and another accomplice begin shooting into the car, riddling lord holmes's driver and bodyguard with bullet. the onslaught of bullets does not kill lord holmes, who is currently using his body to shield his family. sean marches over to the rear door and orders lord holmes out of the car at gunpoint. ryan, unable to stand being just a bystander, immediately breaks cover, runs up behind miller, and disarms him. as ryan grabs sean's pistol, he gets shot in the left shoulder by another terrorist. he shoots one of the other terrorists in the shoulder. patrick attempts to flee, but ryan fatally shoots him in the chest. seeing two of the royal guards racing towards the scene, the remaining terrorists quickly get back in their cars and drive off, leaving sean to be captured when the authorities arrive. they retreat back to a warehouse, where they change out license plates.while recovering, ryan is called to testify against miller as a witness. subsequently, miller is convicted on all charges and sentenced to life in prison. ryan is awarded the order of knight commander of the victorian order, and eventually returns to the united states.on the possibility that miller's ula comrades might try to liberate him, the authorities take no chances. to that extent, they set up decoy convoys on the day that miller is to be transferred to albany prison on the isle of wight. however, someone manages to tip off the ula as to which convoy is carrying miller. miller's motorcade is in transit when it comes to a stop at a drawbridge. the raised bridge, however, is a trap meant to block them in. moments after the convoy comes to a stop, miller's comrades attack, using rocket launchers to blow up the cars at the front and rear of the convoy. they then march up to the prison van and order the guards out at gunpoint by taking the bridge operator hostage and threatening to shoot him. one of the guards hesitates and is shot fatally. the other officer and the inspector are pulled out, thrown on the ground and forced to lie on their stomachs. o'donnell hands miller a pistol, with which he coldly executes both officers and the bridge operator by shooting them in the backs of their heads. they then depart the scene.miller and his companions flee on a cargo ship from britain to libya, to prepare for their next kidnapping attempt on lord holmes. miller however, cannot shake his anger towards ryan for killing his younger brother and persuades several members of his entourage to accompany him to the united states on a short mission targeting ryan and his family.miller travels to the united states, accompanied by annette, comrade ned clark, and a fourth henchman. annette and clark travel to the united states naval academy to ambush ryan as he is leaving work. ryan notices clark idling nearby. clark casually walks away, but ryan clearly unnerved, then gets even more nervous as he hears a car engine starting. his suspicions are well warrantedas he continues walking down the street, he looks in the reflection of a parked van's rearview door and sees that clark, and the stolen car driven by annette, are following him. ryan pretends not to see them until he is walking past a parked panel truck, at which point he ducks behind the truck. clark draws a silenced pistol and prepares to corner ryan, but ryan attacks him first. clark quickly gets the upper hand and throws ryan to the curb. he grabs his pistol and is about to shoot ryan when he is shot dead himself by a us marine sentry. annette quickly drives away in her jeep. ryan, realizing that miller is going to target cathy and sally, jumps into his car and drives off.simultaneously, miller and the other henchman follow cathy as she picks up sally from school, with miller riding shotgun. they prepare to follow the two, but are temporarily held up by a crossing guard. once the crossing guard clears, they pursue her to us highway 50. ryan tries to frantically call cathy on his car phone, but she is busy answering phone calls from the hospital. as he is trying to connect with her, miller and his henchman recklessly weave in and out of traffic to catch up with cathy. ryan eventually gets through to cathy and tells her to get off the highway and find the nearest police station. unfortunately, just as the call ends, miller's van comes charging up on her using the left shoulder. miller moves to the backseat of the van, grabs a submachine gun, and slides open the door. another car momentarily delays him, but once he has a clear aim, miller opens fire, spraying cathy's car with bullets. a multicar pileup ensues as cathy loses control of her car and crashes headon into a concrete barrier. ryan, who has found himself stuck in traffic headed in the other direction, sees the smoke from the crash and realizes the worst.cathy and sally are airlifted to a nearby hospital for treatment. enraged over the nearloss of his family, ryan decides to go back to work for the cia, having earlier rejected the appeal of his former superior, vice admiral james greer james earl jones.ryan's tireless work leads him to conclude that miller has taken refuge in a training camp, one of many located in libya. he also determines that a recent new arrival at the camp may be one of o'donnell's english informants, dennis cooley. ryan makes his recommendations to his superiors at the cia. greer invites jack to a live feed of an sas strike team attack on the camp. everyone in the camp is eliminated but unbeknownst to the cia and ryan, miller and his companions have already fled the camp and are on their way to the us to stage their next attack on lord holmes.lord holmes decides to visit ryan at his home to formally present his kcvo. with the aid of lord holmes' traitorous assistant, miller's group tracks holmes to this location, kills the diplomatic security service agents and maryland state troopers guarding the house, and attempts once more to kidnap lord holmes. ryan leads holmes and his family to safety while he attempts to lure miller and his companions away from his home.the fbi hostage rescue teams are scrambled to pick up holmes. upon realising that ryan is leading them away from holmes, miller's companions try to persuade miller to turn around, but an enraged and deranged miller kills his terrorist companions and continues his pursuit of ryan. ryan and miller fight hand to hand miller is killed when ryan impales him backward on a boat anchor, and his body is obliterated in the subsequent explosion of the craft.credits roll just after caroline ryan learns the gender of the child she is going to have, and before she tells jack and sally."

Experiments with lemmatization

In [ ]:
import nltk
nltk.download('wordnet')
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /root/nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!
Out[ ]:
True
In [ ]:
from nltk.stem import WordNetLemmatizer

text = movie_reviews.head(1)['plot_synopsis'].item()
lemmatizer = WordNetLemmatizer()
text_l1 = ' '.join([lemmatizer.lemmatize(word) for word in text.split()])
In [ ]:
print(text_l1)
jack ryan ford is on a working vacation in london with his family. he ha retired from the cia and is a professor at the u naval academy. he is seen delivering a lecture at the royal naval academy in london.meanwhile, ryan's wife cathy and daughter sally are sightseeing near buckingham palace. sally and cathy come upon a british royal guard, and sally try to get the guard to react by doing an improvised tap dance in front of him. she is impressed when the guard, trained to ignore distraction, doe not react at all, and they leave.as sally and cathy walk away from the guard, en route to rendezvous with ryan, they walk by a stolen cab, in which sit three ulster liberation army terrorist kevin o'donnell, the driver, a well a sean miller sean bean and his younger brother patrick. the three are loading bullet into their gun a they prepare to carry out a scheduled ambush on lord william holmes, british secretary of state for northern ireland and a distant member of the british royal family the cousin of the queen mother. a they finish loading their weapons, another terrorist, annette, radio to them that lord holmes is leaving the palace with his wife and son.as lord holmes's car leaf the palace, the terrorists' car fall in line behind him. they follow the car, headed for the ambush point. a they drive, o'donnell and the miller brother don ski masks.meanwhile, ryan is about to cross the street to meet with his wife and daughter. he arrives at the same time that lord holmes's car is passing through the area.suddenly, another cab, parked on the side of the road, driven by annette, pull out and swerve in front of lord holmes's vehicle. simultaneously, o'donnell's vehicle swerve to a stop behind the car. the miller and another accomplice jump out and quickly plant a bomb underneath the chauffeured car's engine. ryan see what is going on and hastily get his family to cover just a the bomb is detonated. moment later, sean and another accomplice begin shooting into the car, riddling lord holmes's driver and bodyguard with bullet. the onslaught of bullet doe not kill lord holmes, who is currently using his body to shield his family. sean march over to the rear door and order lord holmes out of the car at gunpoint. ryan, unable to stand being just a bystander, immediately break cover, run up behind miller, and disarms him. a ryan grab sean's pistol, he get shot in the left shoulder by another terrorist. he shoot one of the other terrorist in the shoulder. patrick attempt to flee, but ryan fatally shoot him in the chest. seeing two of the royal guard racing towards the scene, the remaining terrorist quickly get back in their car and drive off, leaving sean to be captured when the authority arrive. they retreat back to a warehouse, where they change out license plates.while recovering, ryan is called to testify against miller a a witness. subsequently, miller is convicted on all charge and sentenced to life in prison. ryan is awarded the order of knight commander of the victorian order, and eventually return to the united states.on the possibility that miller's ula comrade might try to liberate him, the authority take no chances. to that extent, they set up decoy convoy on the day that miller is to be transferred to albany prison on the isle of wight. however, someone manages to tip off the ula a to which convoy is carrying miller. miller's motorcade is in transit when it come to a stop at a drawbridge. the raised bridge, however, is a trap meant to block them in. moment after the convoy come to a stop, miller's comrade attack, using rocket launcher to blow up the car at the front and rear of the convoy. they then march up to the prison van and order the guard out at gunpoint by taking the bridge operator hostage and threatening to shoot him. one of the guard hesitates and is shot fatally. the other officer and the inspector are pulled out, thrown on the ground and forced to lie on their stomachs. o'donnell hand miller a pistol, with which he coldly executes both officer and the bridge operator by shooting them in the back of their heads. they then depart the scene.miller and his companion flee on a cargo ship from britain to libya, to prepare for their next kidnapping attempt on lord holmes. miller however, cannot shake his anger towards ryan for killing his younger brother and persuades several member of his entourage to accompany him to the united state on a short mission targeting ryan and his family.miller travel to the united states, accompanied by annette, comrade ned clark, and a fourth henchman. annette and clark travel to the united state naval academy to ambush ryan a he is leaving work. ryan notice clark idling nearby. clark casually walk away, but ryan clearly unnerved, then get even more nervous a he hears a car engine starting. his suspicion are well warrantedas he continues walking down the street, he look in the reflection of a parked van's rearview door and see that clark, and the stolen car driven by annette, are following him. ryan pretend not to see them until he is walking past a parked panel truck, at which point he duck behind the truck. clark draw a silenced pistol and prepares to corner ryan, but ryan attack him first. clark quickly get the upper hand and throw ryan to the curb. he grab his pistol and is about to shoot ryan when he is shot dead himself by a u marine sentry. annette quickly drive away in her jeep. ryan, realizing that miller is going to target cathy and sally, jump into his car and drive off.simultaneously, miller and the other henchman follow cathy a she pick up sally from school, with miller riding shotgun. they prepare to follow the two, but are temporarily held up by a crossing guard. once the crossing guard clears, they pursue her to u highway 50. ryan try to frantically call cathy on his car phone, but she is busy answering phone call from the hospital. a he is trying to connect with her, miller and his henchman recklessly weave in and out of traffic to catch up with cathy. ryan eventually get through to cathy and tell her to get off the highway and find the nearest police station. unfortunately, just a the call ends, miller's van come charging up on her using the left shoulder. miller move to the backseat of the van, grab a submachine gun, and slide open the door. another car momentarily delay him, but once he ha a clear aim, miller open fire, spraying cathy's car with bullets. a multicar pileup ensues a cathy loses control of her car and crash headon into a concrete barrier. ryan, who ha found himself stuck in traffic headed in the other direction, see the smoke from the crash and realizes the worst.cathy and sally are airlifted to a nearby hospital for treatment. enraged over the nearloss of his family, ryan decides to go back to work for the cia, having earlier rejected the appeal of his former superior, vice admiral james greer james earl jones.ryan's tireless work lead him to conclude that miller ha taken refuge in a training camp, one of many located in libya. he also determines that a recent new arrival at the camp may be one of o'donnell's english informants, dennis cooley. ryan make his recommendation to his superior at the cia. greer invite jack to a live feed of an sa strike team attack on the camp. everyone in the camp is eliminated but unbeknownst to the cia and ryan, miller and his companion have already fled the camp and are on their way to the u to stage their next attack on lord holmes.lord holmes decides to visit ryan at his home to formally present his kcvo. with the aid of lord holmes' traitorous assistant, miller's group track holmes to this location, kill the diplomatic security service agent and maryland state trooper guarding the house, and attempt once more to kidnap lord holmes. ryan lead holmes and his family to safety while he attempt to lure miller and his companion away from his home.the fbi hostage rescue team are scrambled to pick up holmes. upon realising that ryan is leading them away from holmes, miller's companion try to persuade miller to turn around, but an enraged and deranged miller kill his terrorist companion and continues his pursuit of ryan. ryan and miller fight hand to hand miller is killed when ryan impales him backward on a boat anchor, and his body is obliterated in the subsequent explosion of the craft.credits roll just after caroline ryan learns the gender of the child she is going to have, and before she tell jack and sally.
In [ ]:
from nltk.corpus import wordnet

def get_wordnet_pos(word):
    """Map POS tag to first character lemmatize() accepts"""
    tag = nltk.pos_tag([word])[0][1][0].upper()
    tag_dict = {"J": wordnet.ADJ,
                "N": wordnet.NOUN,
                "V": wordnet.VERB,
                "R": wordnet.ADV}
    return tag_dict.get(tag, wordnet.NOUN)

text_l2 = ' '.join([lemmatizer.lemmatize(w, get_wordnet_pos(w)) for w in nltk.word_tokenize(text)])
In [ ]:
print(text_l2)
jack ryan ford be on a work vacation in london with his family . he have retire from the cia and be a professor at the u naval academy . he be see deliver a lecture at the royal naval academy in london.meanwhile , ryan 's wife cathy and daughter sally be sightsee near buckingham palace . sally and cathy come upon a british royal guard , and sally try to get the guard to react by do an improvised tap dance in front of him . she be impressed when the guard , train to ignore distraction , do not react at all , and they leave.as sally and cathy walk away from the guard , en route to rendezvous with ryan , they walk by a steal cab , in which sit three ulster liberation army terrorist kevin o'donnell , the driver , a well a sean miller sean bean and his young brother patrick . the three be load bullet into their gun a they prepare to carry out a schedule ambush on lord william holmes , british secretary of state for northern ireland and a distant member of the british royal family the cousin of the queen mother . a they finish load their weapon , another terrorist , annette , radio to them that lord holmes be leave the palace with his wife and son.as lord holmes 's car leaf the palace , the terrorist ' car fall in line behind him . they follow the car , head for the ambush point . a they drive , o'donnell and the miller brother don ski masks.meanwhile , ryan be about to cross the street to meet with his wife and daughter . he arrives at the same time that lord holmes 's car be passing through the area.suddenly , another cab , park on the side of the road , driven by annette , pull out and swerve in front of lord holmes 's vehicle . simultaneously , o'donnell 's vehicle swerve to a stop behind the car . the miller and another accomplice jump out and quickly plant a bomb underneath the chauffeur car 's engine . ryan see what be go on and hastily get his family to cover just a the bomb be detonate . moment later , sean and another accomplice begin shoot into the car , riddle lord holmes 's driver and bodyguard with bullet . the onslaught of bullet do not kill lord holmes , who be currently use his body to shield his family . sean march over to the rear door and order lord holmes out of the car at gunpoint . ryan , unable to stand be just a bystander , immediately break cover , run up behind miller , and disarms him . a ryan grab sean 's pistol , he get shot in the left shoulder by another terrorist . he shoot one of the other terrorist in the shoulder . patrick attempt to flee , but ryan fatally shoot him in the chest . see two of the royal guard race towards the scene , the remain terrorist quickly get back in their car and drive off , leave sean to be capture when the authority arrive . they retreat back to a warehouse , where they change out license plates.while recover , ryan be call to testify against miller a a witness . subsequently , miller be convict on all charge and sentence to life in prison . ryan be award the order of knight commander of the victorian order , and eventually return to the united states.on the possibility that miller 's ula comrade might try to liberate him , the authority take no chance . to that extent , they set up decoy convoy on the day that miller be to be transfer to albany prison on the isle of wight . however , someone manages to tip off the ula a to which convoy be carry miller . miller 's motorcade be in transit when it come to a stop at a drawbridge . the raise bridge , however , be a trap meant to block them in . moment after the convoy come to a stop , miller 's comrade attack , use rocket launcher to blow up the car at the front and rear of the convoy . they then march up to the prison van and order the guard out at gunpoint by take the bridge operator hostage and threaten to shoot him . one of the guard hesitates and be shot fatally . the other officer and the inspector be pull out , thrown on the ground and force to lie on their stomach . o'donnell hand miller a pistol , with which he coldly executes both officer and the bridge operator by shoot them in the back of their head . they then depart the scene.miller and his companion flee on a cargo ship from britain to libya , to prepare for their next kidnapping attempt on lord holmes . miller however , can not shake his anger towards ryan for kill his young brother and persuades several member of his entourage to accompany him to the united state on a short mission target ryan and his family.miller travel to the united state , accompany by annette , comrade ned clark , and a fourth henchman . annette and clark travel to the united state naval academy to ambush ryan a he be leave work . ryan notice clark idle nearby . clark casually walk away , but ryan clearly unnerved , then get even more nervous a he hears a car engine start . his suspicion be well warrantedas he continue walk down the street , he look in the reflection of a park van 's rearview door and see that clark , and the steal car driven by annette , be follow him . ryan pretend not to see them until he be walk past a park panel truck , at which point he duck behind the truck . clark draw a silence pistol and prepares to corner ryan , but ryan attack him first . clark quickly get the upper hand and throw ryan to the curb . he grab his pistol and be about to shoot ryan when he be shot dead himself by a u marine sentry . annette quickly drive away in her jeep . ryan , realize that miller be go to target cathy and sally , jump into his car and drive off.simultaneously , miller and the other henchman follow cathy a she pick up sally from school , with miller rid shotgun . they prepare to follow the two , but be temporarily held up by a cross guard . once the cross guard clear , they pursue her to u highway 50. ryan try to frantically call cathy on his car phone , but she be busy answer phone call from the hospital . a he be try to connect with her , miller and his henchman recklessly weave in and out of traffic to catch up with cathy . ryan eventually get through to cathy and tell her to get off the highway and find the near police station . unfortunately , just a the call end , miller 's van come charge up on her use the left shoulder . miller move to the backseat of the van , grab a submachine gun , and slide open the door . another car momentarily delay him , but once he have a clear aim , miller open fire , spray cathy 's car with bullet . a multicar pileup ensues a cathy loses control of her car and crash headon into a concrete barrier . ryan , who have found himself stuck in traffic head in the other direction , see the smoke from the crash and realizes the worst.cathy and sally be airlift to a nearby hospital for treatment . enrage over the nearloss of his family , ryan decides to go back to work for the cia , have earlier reject the appeal of his former superior , vice admiral james greer james earl jones.ryan 's tireless work lead him to conclude that miller have take refuge in a training camp , one of many locate in libya . he also determines that a recent new arrival at the camp may be one of o'donnell 's english informant , dennis cooley . ryan make his recommendation to his superior at the cia . greer invite jack to a live feed of an sa strike team attack on the camp . everyone in the camp be eliminate but unbeknownst to the cia and ryan , miller and his companion have already flee the camp and be on their way to the u to stage their next attack on lord holmes.lord holmes decides to visit ryan at his home to formally present his kcvo . with the aid of lord holmes ' traitorous assistant , miller 's group track holmes to this location , kill the diplomatic security service agent and maryland state trooper guard the house , and attempt once more to kidnap lord holmes . ryan lead holmes and his family to safety while he attempt to lure miller and his companion away from his home.the fbi hostage rescue team be scramble to pick up holmes . upon realise that ryan be lead them away from holmes , miller 's companion try to persuade miller to turn around , but an enrage and derange miller kill his terrorist companion and continue his pursuit of ryan . ryan and miller fight hand to hand miller be kill when ryan impales him backward on a boat anchor , and his body be obliterate in the subsequent explosion of the craft.credits roll just after caroline ryan learns the gender of the child she be go to have , and before she tell jack and sally .
In [ ]:
movie_reviews['review_text'] = movie_reviews['review_text'].map(lambda x: ' '.join([lemmatizer.lemmatize(w, get_wordnet_pos(w)) for w in nltk.word_tokenize(x)]))
movie_reviews['plot_synopsis'] = movie_reviews['plot_synopsis'].map(lambda x: ' '.join([lemmatizer.lemmatize(w, get_wordnet_pos(w)) for w in nltk.word_tokenize(x)]))
movie_reviews['plot_summary'] = movie_reviews['plot_summary'].map(lambda x: ' '.join([lemmatizer.lemmatize(w, get_wordnet_pos(w)) for w in nltk.word_tokenize(x)]))
movie_reviews['review_summary'] = movie_reviews['review_summary'].map(lambda x: ' '.join([lemmatizer.lemmatize(w, get_wordnet_pos(w)) for w in nltk.word_tokenize(x)]))
In [ ]:
from transformers import AutoTokenizer

# Specify the model name
model_name = "bert-base-uncased"

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
In [ ]:
print(len(tokenizer.tokenize(text, add_special_tokens=False, truncation=False)))
print(len(tokenizer.tokenize(text_l1, add_special_tokens=False, truncation=False)))
print(len(tokenizer.tokenize(text_l2, add_special_tokens=False, truncation=False)))
1797
1794
1791
In [ ]:
print(len(set(tokenizer.tokenize(text, add_special_tokens=False, truncation=False))))
print(len(set(tokenizer.tokenize(text_l1, add_special_tokens=False, truncation=False))))
print(len(set(tokenizer.tokenize(text_l2, add_special_tokens=False, truncation=False))))
615
584
545

Lemmatization did not shorten the number of tokens significantly. The number of words in vocabulary decreased. For now saving this without lemmaiaion. For models such as BERT lemmatiation is not required. However, for LSTM, lemmatization is needed. Once we persist the pre-processed textx without lemmatization, we will lemmatize them as required in this notebook.

In [ ]:
movie_reviews['review_summary_num_tokens']=movie_reviews['review_summary'].map(lambda x: len(tokenizer.tokenize(x, add_special_tokens=False, truncation=False)))
movie_reviews['review_text_num_tokens']= movie_reviews['review_text'].map(lambda x: len(tokenizer.tokenize(x, add_special_tokens=False, truncation=False)))
movie_reviews['plot_synopsis_num_tokens']= movie_reviews['plot_synopsis'].map(lambda x: len(tokenizer.tokenize(x, add_special_tokens=False, truncation=False)))
movie_reviews['plot_summary_num_tokens']= movie_reviews['plot_summary'].map(lambda x: len(tokenizer.tokenize(x, add_special_tokens=False, truncation=False)))
In [ ]:
# Write the resulting data frame as csv for later use

import csv

csv_file_path = os.path.join(dataset_path, "movie_reviews.csv")

movie_reviews.to_csv(csv_file_path, index=False)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-2-989cec618348> in <cell line: 5>()
      3 import csv
      4 
----> 5 csv_file_path = os.path.join(dataset_path, "movie_reviews.csv")
      6 
      7 movie_reviews.to_csv(csv_file_path, index=False)

NameError: name 'os' is not defined

EDA

In [ ]:
import csv
import os
import pandas as pd

dataset_path = os.path.join(prefix_path, "dataset", "imdb_spoiler")

csv_file_path = os.path.join(dataset_path, "movie_reviews.csv")

movie_reviews = pd.read_csv(csv_file_path)
In [ ]:
movie_reviews
Out[ ]:
movie_id plot_summary duration genre rating_x release_date plot_synopsis review_date user_id is_spoiler review_text rating_y review_summary review_id _merge review_summary_num_tokens review_text_num_tokens plot_synopsis_num_tokens plot_summary_num_tokens
0 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min ['Action', 'Thriller'] 6.9 1992-06-05 jack ryan ford is on a working vacation in lon... 5 March 2008 ur16517420 True the second tom clancy novel made into a film '... 6 decent clancy thriller bbc4b9ff-ea1b-455a-8b7e-32370d35fe17 both 3 431 1797 217
1 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min ['Action', 'Thriller'] 6.9 1992-06-05 jack ryan ford is on a working vacation in lon... 15 October 2007 ur13887584 True the second in what looks like becoming the 'ja... 7 jack ryan chapter ii e04d4f23-89b5-4715-add8-1ee37d46dd99 both 4 129 1797 217
2 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min ['Action', 'Thriller'] 6.9 1992-06-05 jack ryan ford is on a working vacation in lon... 16 September 2013 ur2403537 True i was not a fan of the hunt for red october. i... 8 it's filled to the brim with excitement and it... b28cbca4-dc30-421c-8e3a-dc2e3cba98c1 both 20 401 1797 217
3 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min ['Action', 'Thriller'] 6.9 1992-06-05 jack ryan ford is on a working vacation in lon... 15 May 2007 ur0819382 True jack ryan harrison ford is a cia analyst who s... 7 patriotic gore. c657363f-224d-48db-a029-8e4ddd2b248d both 3 1045 1797 217
4 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min ['Action', 'Thriller'] 6.9 1992-06-05 jack ryan ford is on a working vacation in lon... 29 February 2004 ur2567338 True this was one of the big summer movies of 1992.... 7 harrison ford in another tom clancy adaptation 44c6a555-674d-4f1c-a9fa-88b35b82c8ba both 7 236 1797 217
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
538823 tt5013056 evacuation of allied soldiers from the british... 1h 46min ['Action', 'Drama', 'History'] 8.1 2017-07-21 the film alternates between three different pe... 2 August 2017 ur78983408 False dunkirk is a beautifully done movie that has h... 10 possibly the greatest war film ever made ef14e72a-77cd-4b9c-9f85-37d01c55fded both 7 175 1567 52
538824 tt5013056 evacuation of allied soldiers from the british... 1h 46min ['Action', 'Drama', 'History'] 8.1 2017-07-21 the film alternates between three different pe... 2 August 2017 ur53645293 False dunkirk is one of the rare cases a film receiv... 10 why this could actually be one of the best war... d4bfc31d-ae8e-4e48-b616-936132153630 both 16 840 1567 52
538825 tt5013056 evacuation of allied soldiers from the british... 1h 46min ['Action', 'Drama', 'History'] 8.1 2017-07-21 the film alternates between three different pe... 31 July 2017 ur45202903 False film gave insufficient background on what was ... 2 disappointed e47d2365-78d9-44c1-86e8-9c02d190b0cb both 1 57 1567 52
538826 tt5013056 evacuation of allied soldiers from the british... 1h 46min ['Action', 'Drama', 'History'] 8.1 2017-07-21 the film alternates between three different pe... 28 July 2017 ur1319404 False in screen writing, a shot is an image captured... 7 dunkirk for millennials d985d458-0b1b-4cec-a5d2-6e9431770ce0 both 4 146 1567 52
538827 tt5013056 evacuation of allied soldiers from the british... 1h 46min ['Action', 'Drama', 'History'] 8.1 2017-07-21 the film alternates between three different pe... 25 July 2017 ur4161069 False in a movie that entirely engulfs you it's rath... 10 christopher nolan reaches yet another new high... 88c38da2-d4f7-4ee2-ae1b-76c8d430d495 both 12 345 1567 52

538828 rows × 19 columns

In [ ]:
import seaborn as sns
import matplotlib.pyplot as plt

Hist plot of number of tokens in review_text

In [ ]:
# If there is correlation betwen num tokens in review_text and is_spoiler
sns.histplot(x='review_text_num_tokens', hue='is_spoiler', data=movie_reviews)
Out[ ]:
<Axes: xlabel='review_text_num_tokens', ylabel='Count'>
No description has been provided for this image
In [ ]:
print(movie_reviews['review_text_num_tokens'].value_counts().sort_values(ascending=False))
review_text_num_tokens
156     2827
147     2802
158     2796
155     2785
151     2785
        ... 
1778       1
2830       1
2293       1
1871       1
1536       1
Name: count, Length: 1614, dtype: int64
In [ ]:
movie_reviews['review_text_num_tokens'].mode()
Out[ ]:
0    156
Name: review_text_num_tokens, dtype: int64

Histplot of all quantitative features

In [ ]:
# Split Quantitative from Categorical
x_quantitative = ['rating_x', 'rating_y', 'review_text_num_tokens', 'review_summary_num_tokens','plot_synopsis_num_tokens', 'plot_summary_num_tokens']
y = ['is_spoiler']
#x_categorical = [feature for feature in movie_reviews.columns if ((feature not in x_quantitative) and (feature not in y)) ]
#x_categorical
In [ ]:
def hist_plots(data, columns, rem_ol=False, thres=0.99, scale_graph=9, n_cols=3, aspect_ratio=2/3 ):
    '''Create multiple Histograms plots using a subset of variables specified.

    Args:
        data: Input data-frame containing variables we wish to plot.
        columns: Listing of column-names we wish to plot.
        rem_ol: Remove observations greater than specific percentile defined by thres argument.
        thres: Percentile that will be used if rem_ol=True.
        scale_graph: Adjust the total size of the graph.
        n_cols: Adjust how many graphs we have on each row.
        aspect_ratio: Adjust the aspect ratio of each individual graph. For squared graphs use 1/1.
    '''

    # Adjusting how many rows the grid will have and proper sizes
    n_rows = len(columns)//n_cols+(len(columns)%n_cols>0)
    fig, axes = plt.subplots(n_rows, n_cols, figsize=(scale_graph, (scale_graph/n_cols)*aspect_ratio*n_rows))

    # Plotting
    fig.suptitle(f'Histograms of {len(columns)} columns',y=1, size=15)
    axes=axes.flatten()
    for i,feature in enumerate(columns):
        if rem_ol:
            lim = data[feature].quantile([thres]).iloc[0]
            condition = data[feature]<lim
            x = data.loc[condition, [feature, 'is_spoiler']]
            print(f'{feature}: Observations greater than P{round(thres*100)} removed')
            print(x.columns)
        else:
            x=data[[feature, 'is_spoiler']]
            print(x.columns)
        sns.histplot(x = feature, data=x, ax=axes[i], hue = 'is_spoiler');
        axes[i].set_title(feature, size=10);
    plt.tight_layout()
In [ ]:
hist_plots(movie_reviews,x_quantitative, n_cols=2)
Index(['rating_x', 'is_spoiler'], dtype='object')
Index(['rating_y', 'is_spoiler'], dtype='object')
Index(['review_text_num_tokens', 'is_spoiler'], dtype='object')
Index(['review_summary_num_tokens', 'is_spoiler'], dtype='object')
Index(['plot_synopsis_num_tokens', 'is_spoiler'], dtype='object')
Index(['plot_summary_num_tokens', 'is_spoiler'], dtype='object')
No description has been provided for this image

Th number of tokens in plot_summary looks more reasonable than in plot_synopsis. We should try training the model with plot_summary as well as plot_synopsis.

As number of tokens increases in review_text, the more likely it is to be a spoiler. There are few reviews with number of tokens below 30.

Relationship between genres and is_spoiler.

In [ ]:
movie_reviews['genre'].describe()
Out[ ]:
count                                538828
unique                                  218
top       ['Action', 'Adventure', 'Sci-Fi']
freq                                  39020
Name: genre, dtype: object
In [ ]:
movie_reviews['genre'] = movie_reviews['genre'].map(lambda x: x.split(','))
movie_reviews['genre'] = movie_reviews['genre'].apply(lambda x: x if isinstance(x, list) else [])
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-22-e7d38f67f309> in <cell line: 1>()
----> 1 movie_reviews['genre'] = movie_reviews['genre'].map(lambda x: x.split(','))
      2 movie_reviews['genre'] = movie_reviews['genre'].apply(lambda x: x if isinstance(x, list) else [])

/usr/local/lib/python3.10/dist-packages/pandas/core/series.py in map(self, arg, na_action)
   4395         dtype: object
   4396         """
-> 4397         new_values = self._map_values(arg, na_action=na_action)
   4398         return self._constructor(new_values, index=self.index, copy=False).__finalize__(
   4399             self, method="map"

/usr/local/lib/python3.10/dist-packages/pandas/core/base.py in _map_values(self, mapper, na_action)
    922 
    923         # mapper is a function
--> 924         new_values = map_f(values, mapper)
    925 
    926         return new_values

/usr/local/lib/python3.10/dist-packages/pandas/_libs/lib.pyx in pandas._libs.lib.map_infer()

<ipython-input-22-e7d38f67f309> in <lambda>(x)
----> 1 movie_reviews['genre'] = movie_reviews['genre'].map(lambda x: x.split(','))
      2 movie_reviews['genre'] = movie_reviews['genre'].apply(lambda x: x if isinstance(x, list) else [])

AttributeError: 'list' object has no attribute 'split'
In [ ]:
movie_reviews['genre'].describe()
Out[ ]:
count                                    538828
unique                                      218
top       [['Action',  'Adventure',  'Sci-Fi']]
freq                                      39020
Name: genre, dtype: object
In [ ]:
genres_spoilers = movie_reviews[['genre', 'is_spoiler']]
genres_spoilers = genres_spoilers.explode('genre')
#genres_spoilers = genres_spoilers.groupby(['genre', 'is_spoiler']).size().reset_index(name='count')
genres_spoilers
Out[ ]:
genre is_spoiler
0 ['Action' True
0 'Thriller'] True
1 ['Action' True
1 'Thriller'] True
2 ['Action' True
... ... ...
538826 'Drama' False
538826 'History'] False
538827 ['Action' False
538827 'Drama' False
538827 'History'] False

1430664 rows × 2 columns

In [ ]:
sns.countplot(data=genres_spoilers, x='genre', hue='is_spoiler')

# Rotate x-axis labels by 90 degrees
plt.xticks(rotation=90)

# Show the plot
plt.show()
No description has been provided for this image

A portion of all genres seems to be spoilers. No clear correlation found.

Deep Learning training on set of 10000 positive and 10000 negative examples

Take a sample of 10000 rows and test for review-Text and plot_summary

In [ ]:
import csv
import os
import pandas as pd

dataset_path = os.path.join(prefix_path, "dataset", "imdb_spoiler")

csv_file_path = os.path.join(dataset_path, "movie_reviews.csv")

movie_reviews = pd.read_csv(csv_file_path)
In [ ]:
# Convert boolean is_spoiler to 0 and 1.
movie_reviews['is_spoiler'] = movie_reviews['is_spoiler'].astype(int)
In [ ]:
movie_reviews = movie_reviews[movie_reviews['review_text_num_tokens'] >= 50]
In [ ]:
movie_reviews['sum num tokens review_text plus plot summary'] = movie_reviews['review_text_num_tokens'] + movie_reviews['plot_summary_num_tokens']
In [ ]:
import seaborn as sns
import matplotlib.pyplot as plt
sns.histplot(x='sum num tokens review_text plus plot summary', hue='is_spoiler', data=movie_reviews)
Out[ ]:
<Axes: xlabel='sum num tokens review_text plus plot summary', ylabel='Count'>
No description has been provided for this image

Using a longfornmer where a seq length of 4096 will allow us to process both review text and plot_summary, and all of it as we don;t see anything going beyond that limit where the combined lengths of plot_summary and review text is beyond 4096.

In [ ]:
movie_reviews[movie_reviews['is_spoiler'] == 1]
Out[ ]:
movie_id plot_summary duration genre rating_x release_date plot_synopsis review_date user_id is_spoiler review_text rating_y review_summary review_id _merge review_summary_num_tokens review_text_num_tokens plot_synopsis_num_tokens plot_summary_num_tokens sum num tokens review_text plus plot summary
0 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min ['Action', 'Thriller'] 6.9 1992-06-05 jack ryan ford is on a working vacation in lon... 5 March 2008 ur16517420 1 the second tom clancy novel made into a film '... 6 decent clancy thriller bbc4b9ff-ea1b-455a-8b7e-32370d35fe17 both 3 431 1797 217 648
1 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min ['Action', 'Thriller'] 6.9 1992-06-05 jack ryan ford is on a working vacation in lon... 15 October 2007 ur13887584 1 the second in what looks like becoming the 'ja... 7 jack ryan chapter ii e04d4f23-89b5-4715-add8-1ee37d46dd99 both 4 129 1797 217 346
2 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min ['Action', 'Thriller'] 6.9 1992-06-05 jack ryan ford is on a working vacation in lon... 16 September 2013 ur2403537 1 i was not a fan of the hunt for red october. i... 8 it's filled to the brim with excitement and it... b28cbca4-dc30-421c-8e3a-dc2e3cba98c1 both 20 401 1797 217 618
3 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min ['Action', 'Thriller'] 6.9 1992-06-05 jack ryan ford is on a working vacation in lon... 15 May 2007 ur0819382 1 jack ryan harrison ford is a cia analyst who s... 7 patriotic gore. c657363f-224d-48db-a029-8e4ddd2b248d both 3 1045 1797 217 1262
4 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min ['Action', 'Thriller'] 6.9 1992-06-05 jack ryan ford is on a working vacation in lon... 29 February 2004 ur2567338 1 this was one of the big summer movies of 1992.... 7 harrison ford in another tom clancy adaptation 44c6a555-674d-4f1c-a9fa-88b35b82c8ba both 7 236 1797 217 453
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
538653 tt5013056 evacuation of allied soldiers from the british... 1h 46min ['Action', 'Drama', 'History'] 8.1 2017-07-21 the film alternates between three different pe... 21 July 2017 ur32188727 1 this is a nolan film and like every other nola... 7 a visual feast and amazing film. 05852db1-f196-4ee7-9ec8-18fbb7b5e963 both 7 589 1567 52 641
538654 tt5013056 evacuation of allied soldiers from the british... 1h 46min ['Action', 'Drama', 'History'] 8.1 2017-07-21 the film alternates between three different pe... 24 July 2017 ur30392044 1 world war ii movies are a genre that until ver... 9 steals pure cinema back from the grave dug by ... 038eeaa9-a1c1-4976-8ddd-2b13af0c158b both 11 823 1567 52 875
538655 tt5013056 evacuation of allied soldiers from the british... 1h 46min ['Action', 'Drama', 'History'] 8.1 2017-07-21 the film alternates between three different pe... 24 July 2017 ur68483081 1 before we talk about the film let me tell you ... 10 dunkirk met a genuine dunkirk veteran at my sc... ec4ac50a-8d4b-4076-9fa1-739a590f795d both 10 587 1567 52 639
538656 tt5013056 evacuation of allied soldiers from the british... 1h 46min ['Action', 'Drama', 'History'] 8.1 2017-07-21 the film alternates between three different pe... 21 July 2017 ur37909248 1 from the beginning til the end i held my breat... 10 i had high expectation and the movie delivered. ee91c48d-7f2e-4acc-be47-e3811b109c12 both 9 96 1567 52 148
538657 tt5013056 evacuation of allied soldiers from the british... 1h 46min ['Action', 'Drama', 'History'] 8.1 2017-07-21 the film alternates between three different pe... 20 July 2017 ur69070086 1 one of the best war movies i have ever watched... 10 the battle of dunkirk 7c71c712-e04b-47e3-a8cc-b4632574d368 both 4 94 1567 52 146

143055 rows × 20 columns

In [ ]:
movie_reviews[(movie_reviews['sum num tokens review_text plus plot summary'] < 512) & (movie_reviews['is_spoiler'] == 1)]
Out[ ]:
movie_id plot_summary duration genre rating_x release_date plot_synopsis review_date user_id is_spoiler review_text rating_y review_summary review_id _merge review_summary_num_tokens review_text_num_tokens plot_synopsis_num_tokens plot_summary_num_tokens sum num tokens review_text plus plot summary
1 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min ['Action', 'Thriller'] 6.9 1992-06-05 jack ryan ford is on a working vacation in lon... 15 October 2007 ur13887584 1 the second in what looks like becoming the 'ja... 7 jack ryan chapter ii e04d4f23-89b5-4715-add8-1ee37d46dd99 both 4 129 1797 217 346
4 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min ['Action', 'Thriller'] 6.9 1992-06-05 jack ryan ford is on a working vacation in lon... 29 February 2004 ur2567338 1 this was one of the big summer movies of 1992.... 7 harrison ford in another tom clancy adaptation 44c6a555-674d-4f1c-a9fa-88b35b82c8ba both 7 236 1797 217 453
5 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min ['Action', 'Thriller'] 6.9 1992-06-05 jack ryan ford is on a working vacation in lon... 7 June 2002 ur0002746 1 this is a sorry excuse of a movie. i do not kn... 4 director fell asleep during movie. 6f7c7135-e16c-4fc9-9f89-1a1bac6b20c3 both 6 270 1797 217 487
8 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min ['Action', 'Thriller'] 6.9 1992-06-05 jack ryan ford is on a working vacation in lon... 7 August 2016 ur45215448 1 when cia analyst jack ryan interferes with an ... 9 the best of the ryan adaptations so far!!! 4a4c44f2-dc3b-41ac-a663-a1156bf1f940 both 11 143 1797 217 360
11 tt0105112 former cia analyst, jack ryan is in england wi... 1h 57min ['Action', 'Thriller'] 6.9 1992-06-05 jack ryan ford is on a working vacation in lon... 9 March 2008 ur12816955 1 harrison ford is dry and stiff as an actor in ... 2 what a bad film e5b72fb2-b032-41e1-a3ee-0d6830a2b60f both 4 251 1797 217 468
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
538646 tt5013056 evacuation of allied soldiers from the british... 1h 46min ['Action', 'Drama', 'History'] 8.1 2017-07-21 the film alternates between three different pe... 30 July 2017 ur31231231 1 this film was filmed amazingly, but that is wh... 4 a mishmash of a film, disappointing 571b4560-5e67-4613-b56e-8ac90966adad both 9 328 1567 52 380
538647 tt5013056 evacuation of allied soldiers from the british... 1h 46min ['Action', 'Drama', 'History'] 8.1 2017-07-21 the film alternates between three different pe... 27 July 2017 ur0937743 1 dunkirk 2017 12 fionn whitehead, aneurin barna... 7 shrugworthy, meh nolan's first big disappointi... 8e015f1e-164e-49f9-afb0-8486e529fadd both 14 273 1567 52 325
538648 tt5013056 evacuation of allied soldiers from the british... 1h 46min ['Action', 'Drama', 'History'] 8.1 2017-07-21 the film alternates between three different pe... 23 July 2017 ur37359896 1 a lot of good movies have come in war genre an... 8 people say this movie lack characters and stor... 46982c0a-7318-4612-9559-124a8b5aad2c both 17 318 1567 52 370
538656 tt5013056 evacuation of allied soldiers from the british... 1h 46min ['Action', 'Drama', 'History'] 8.1 2017-07-21 the film alternates between three different pe... 21 July 2017 ur37909248 1 from the beginning til the end i held my breat... 10 i had high expectation and the movie delivered. ee91c48d-7f2e-4acc-be47-e3811b109c12 both 9 96 1567 52 148
538657 tt5013056 evacuation of allied soldiers from the british... 1h 46min ['Action', 'Drama', 'History'] 8.1 2017-07-21 the film alternates between three different pe... 20 July 2017 ur69070086 1 one of the best war movies i have ever watched... 10 the battle of dunkirk 7c71c712-e04b-47e3-a8cc-b4632574d368 both 4 94 1567 52 146

83879 rows × 20 columns

In [ ]:
83879 / 143055
Out[ ]:
0.5863409178288072

58% of spoilers have the length pf review_text + plot_summary less than 512.

In [ ]:
movie_reviews.count()
Out[ ]:
movie_id                     538828
plot_summary                 538828
duration                     538828
genre                        538828
rating_x                     538828
release_date                 538828
plot_synopsis                538828
review_date                  538828
user_id                      538828
is_spoiler                   538828
review_text                  538828
rating_y                     538828
review_summary               538729
review_id                    538828
_merge                       538828
review_summary_num_tokens    538828
review_text_num_tokens       538828
plot_synopsis_num_tokens     538828
plot_summary_num_tokens      538828
dtype: int64
In [ ]:
movie_reviews.count()
Out[ ]:
movie_id                     535083
plot_summary                 535083
duration                     535083
genre                        535083
rating_x                     535083
release_date                 535083
plot_synopsis                535083
review_date                  535083
user_id                      535083
is_spoiler                   535083
review_text                  535083
rating_y                     535083
review_summary               534986
review_id                    535083
_merge                       535083
review_summary_num_tokens    535083
review_text_num_tokens       535083
plot_synopsis_num_tokens     535083
plot_summary_num_tokens      535083
dtype: int64
In [ ]:
movie_reviews['is_spoiler'].value_counts()
Out[ ]:
is_spoiler
False    392028
True     143055
Name: count, dtype: int64
In [ ]:
import numpy as np
from sklearn.model_selection import train_test_split

X = movie_reviews[['review_text', 'plot_summary']]
X['label'] = movie_reviews['is_spoiler'].astype(int)
<ipython-input-9-1f8781bb0975>:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  X['label'] = movie_reviews['is_spoiler'].astype(int)

Bidirectional LSTM

We can try to train the bidirectional LSTM for

In [ ]:
import numpy as np
from sklearn.model_selection import train_test_split

X = movie_reviews[['review_text']]
X['label'] = movie_reviews['is_spoiler'].astype(int)
<ipython-input-62-ca1b24491510>:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  X['label'] = movie_reviews['is_spoiler'].astype(int)
In [ ]:
# Upsampling negative samples
positive_x = X[X['label'] == 1]
negative_x = X[X['label'] == 0]

# Number of samples to match
num_negative = len(negative_x)

print(num_negative)
print(len(positive_x))
392028
143055
In [ ]:
# Upsample negative samples
positive_x = positive_x.sample(n=10000, replace=False, random_state=42)
negative_x = negative_x.sample(n=10000, replace=False, random_state=42)


# Combine positive samples and upsampled negative samples
balanced_x = pd.concat([negative_x, positive_x])

# Shuffle the combined dataframe
balanced_x = balanced_x.sample(frac=1, random_state=42).reset_index(drop=True)
In [ ]:
!pip install nltk
!pip install spacy
!python -m spacy download en_core_web_sm
Requirement already satisfied: nltk in /usr/local/lib/python3.10/dist-packages (3.8.1)
Requirement already satisfied: click in /usr/local/lib/python3.10/dist-packages (from nltk) (8.1.7)
Requirement already satisfied: joblib in /usr/local/lib/python3.10/dist-packages (from nltk) (1.4.2)
Requirement already satisfied: regex>=2021.8.3 in /usr/local/lib/python3.10/dist-packages (from nltk) (2024.5.15)
Requirement already satisfied: tqdm in /usr/local/lib/python3.10/dist-packages (from nltk) (4.66.4)
Requirement already satisfied: spacy in /usr/local/lib/python3.10/dist-packages (3.7.5)
Requirement already satisfied: spacy-legacy<3.1.0,>=3.0.11 in /usr/local/lib/python3.10/dist-packages (from spacy) (3.0.12)
Requirement already satisfied: spacy-loggers<2.0.0,>=1.0.0 in /usr/local/lib/python3.10/dist-packages (from spacy) (1.0.5)
Requirement already satisfied: murmurhash<1.1.0,>=0.28.0 in /usr/local/lib/python3.10/dist-packages (from spacy) (1.0.10)
Requirement already satisfied: cymem<2.1.0,>=2.0.2 in /usr/local/lib/python3.10/dist-packages (from spacy) (2.0.8)
Requirement already satisfied: preshed<3.1.0,>=3.0.2 in /usr/local/lib/python3.10/dist-packages (from spacy) (3.0.9)
Requirement already satisfied: thinc<8.3.0,>=8.2.2 in /usr/local/lib/python3.10/dist-packages (from spacy) (8.2.5)
Requirement already satisfied: wasabi<1.2.0,>=0.9.1 in /usr/local/lib/python3.10/dist-packages (from spacy) (1.1.3)
Requirement already satisfied: srsly<3.0.0,>=2.4.3 in /usr/local/lib/python3.10/dist-packages (from spacy) (2.4.8)
Requirement already satisfied: catalogue<2.1.0,>=2.0.6 in /usr/local/lib/python3.10/dist-packages (from spacy) (2.0.10)
Requirement already satisfied: weasel<0.5.0,>=0.1.0 in /usr/local/lib/python3.10/dist-packages (from spacy) (0.4.1)
Requirement already satisfied: typer<1.0.0,>=0.3.0 in /usr/local/lib/python3.10/dist-packages (from spacy) (0.12.3)
Requirement already satisfied: tqdm<5.0.0,>=4.38.0 in /usr/local/lib/python3.10/dist-packages (from spacy) (4.66.4)
Requirement already satisfied: requests<3.0.0,>=2.13.0 in /usr/local/lib/python3.10/dist-packages (from spacy) (2.32.3)
Requirement already satisfied: pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4 in /usr/local/lib/python3.10/dist-packages (from spacy) (2.8.2)
Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from spacy) (3.1.4)
Requirement already satisfied: setuptools in /usr/local/lib/python3.10/dist-packages (from spacy) (67.7.2)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from spacy) (24.1)
Requirement already satisfied: langcodes<4.0.0,>=3.2.0 in /usr/local/lib/python3.10/dist-packages (from spacy) (3.4.0)
Requirement already satisfied: numpy>=1.19.0 in /usr/local/lib/python3.10/dist-packages (from spacy) (1.25.2)
Requirement already satisfied: language-data>=1.2 in /usr/local/lib/python3.10/dist-packages (from langcodes<4.0.0,>=3.2.0->spacy) (1.2.0)
Requirement already satisfied: annotated-types>=0.4.0 in /usr/local/lib/python3.10/dist-packages (from pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4->spacy) (0.7.0)
Requirement already satisfied: pydantic-core==2.20.1 in /usr/local/lib/python3.10/dist-packages (from pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4->spacy) (2.20.1)
Requirement already satisfied: typing-extensions>=4.6.1 in /usr/local/lib/python3.10/dist-packages (from pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4->spacy) (4.12.2)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests<3.0.0,>=2.13.0->spacy) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests<3.0.0,>=2.13.0->spacy) (3.7)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests<3.0.0,>=2.13.0->spacy) (2.0.7)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests<3.0.0,>=2.13.0->spacy) (2024.7.4)
Requirement already satisfied: blis<0.8.0,>=0.7.8 in /usr/local/lib/python3.10/dist-packages (from thinc<8.3.0,>=8.2.2->spacy) (0.7.11)
Requirement already satisfied: confection<1.0.0,>=0.0.1 in /usr/local/lib/python3.10/dist-packages (from thinc<8.3.0,>=8.2.2->spacy) (0.1.5)
Requirement already satisfied: click>=8.0.0 in /usr/local/lib/python3.10/dist-packages (from typer<1.0.0,>=0.3.0->spacy) (8.1.7)
Requirement already satisfied: shellingham>=1.3.0 in /usr/local/lib/python3.10/dist-packages (from typer<1.0.0,>=0.3.0->spacy) (1.5.4)
Requirement already satisfied: rich>=10.11.0 in /usr/local/lib/python3.10/dist-packages (from typer<1.0.0,>=0.3.0->spacy) (13.7.1)
Requirement already satisfied: cloudpathlib<1.0.0,>=0.7.0 in /usr/local/lib/python3.10/dist-packages (from weasel<0.5.0,>=0.1.0->spacy) (0.18.1)
Requirement already satisfied: smart-open<8.0.0,>=5.2.1 in /usr/local/lib/python3.10/dist-packages (from weasel<0.5.0,>=0.1.0->spacy) (7.0.4)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->spacy) (2.1.5)
Requirement already satisfied: marisa-trie>=0.7.7 in /usr/local/lib/python3.10/dist-packages (from language-data>=1.2->langcodes<4.0.0,>=3.2.0->spacy) (1.2.0)
Requirement already satisfied: markdown-it-py>=2.2.0 in /usr/local/lib/python3.10/dist-packages (from rich>=10.11.0->typer<1.0.0,>=0.3.0->spacy) (3.0.0)
Requirement already satisfied: pygments<3.0.0,>=2.13.0 in /usr/local/lib/python3.10/dist-packages (from rich>=10.11.0->typer<1.0.0,>=0.3.0->spacy) (2.16.1)
Requirement already satisfied: wrapt in /usr/local/lib/python3.10/dist-packages (from smart-open<8.0.0,>=5.2.1->weasel<0.5.0,>=0.1.0->spacy) (1.14.1)
Requirement already satisfied: mdurl~=0.1 in /usr/local/lib/python3.10/dist-packages (from markdown-it-py>=2.2.0->rich>=10.11.0->typer<1.0.0,>=0.3.0->spacy) (0.1.2)
Collecting en-core-web-sm==3.7.1
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.7.1/en_core_web_sm-3.7.1-py3-none-any.whl (12.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12.8/12.8 MB 92.5 MB/s eta 0:00:00
Requirement already satisfied: spacy<3.8.0,>=3.7.2 in /usr/local/lib/python3.10/dist-packages (from en-core-web-sm==3.7.1) (3.7.5)
Requirement already satisfied: spacy-legacy<3.1.0,>=3.0.11 in /usr/local/lib/python3.10/dist-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (3.0.12)
Requirement already satisfied: spacy-loggers<2.0.0,>=1.0.0 in /usr/local/lib/python3.10/dist-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (1.0.5)
Requirement already satisfied: murmurhash<1.1.0,>=0.28.0 in /usr/local/lib/python3.10/dist-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (1.0.10)
Requirement already satisfied: cymem<2.1.0,>=2.0.2 in /usr/local/lib/python3.10/dist-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (2.0.8)
Requirement already satisfied: preshed<3.1.0,>=3.0.2 in /usr/local/lib/python3.10/dist-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (3.0.9)
Requirement already satisfied: thinc<8.3.0,>=8.2.2 in /usr/local/lib/python3.10/dist-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (8.2.5)
Requirement already satisfied: wasabi<1.2.0,>=0.9.1 in /usr/local/lib/python3.10/dist-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (1.1.3)
Requirement already satisfied: srsly<3.0.0,>=2.4.3 in /usr/local/lib/python3.10/dist-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (2.4.8)
Requirement already satisfied: catalogue<2.1.0,>=2.0.6 in /usr/local/lib/python3.10/dist-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (2.0.10)
Requirement already satisfied: weasel<0.5.0,>=0.1.0 in /usr/local/lib/python3.10/dist-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (0.4.1)
Requirement already satisfied: typer<1.0.0,>=0.3.0 in /usr/local/lib/python3.10/dist-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (0.12.3)
Requirement already satisfied: tqdm<5.0.0,>=4.38.0 in /usr/local/lib/python3.10/dist-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (4.66.4)
Requirement already satisfied: requests<3.0.0,>=2.13.0 in /usr/local/lib/python3.10/dist-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (2.32.3)
Requirement already satisfied: pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4 in /usr/local/lib/python3.10/dist-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (2.8.2)
Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (3.1.4)
Requirement already satisfied: setuptools in /usr/local/lib/python3.10/dist-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (67.7.2)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (24.1)
Requirement already satisfied: langcodes<4.0.0,>=3.2.0 in /usr/local/lib/python3.10/dist-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (3.4.0)
Requirement already satisfied: numpy>=1.19.0 in /usr/local/lib/python3.10/dist-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (1.25.2)
Requirement already satisfied: language-data>=1.2 in /usr/local/lib/python3.10/dist-packages (from langcodes<4.0.0,>=3.2.0->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (1.2.0)
Requirement already satisfied: annotated-types>=0.4.0 in /usr/local/lib/python3.10/dist-packages (from pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (0.7.0)
Requirement already satisfied: pydantic-core==2.20.1 in /usr/local/lib/python3.10/dist-packages (from pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (2.20.1)
Requirement already satisfied: typing-extensions>=4.6.1 in /usr/local/lib/python3.10/dist-packages (from pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (4.12.2)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests<3.0.0,>=2.13.0->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests<3.0.0,>=2.13.0->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (3.7)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests<3.0.0,>=2.13.0->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (2.0.7)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests<3.0.0,>=2.13.0->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (2024.7.4)
Requirement already satisfied: blis<0.8.0,>=0.7.8 in /usr/local/lib/python3.10/dist-packages (from thinc<8.3.0,>=8.2.2->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (0.7.11)
Requirement already satisfied: confection<1.0.0,>=0.0.1 in /usr/local/lib/python3.10/dist-packages (from thinc<8.3.0,>=8.2.2->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (0.1.5)
Requirement already satisfied: click>=8.0.0 in /usr/local/lib/python3.10/dist-packages (from typer<1.0.0,>=0.3.0->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (8.1.7)
Requirement already satisfied: shellingham>=1.3.0 in /usr/local/lib/python3.10/dist-packages (from typer<1.0.0,>=0.3.0->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (1.5.4)
Requirement already satisfied: rich>=10.11.0 in /usr/local/lib/python3.10/dist-packages (from typer<1.0.0,>=0.3.0->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (13.7.1)
Requirement already satisfied: cloudpathlib<1.0.0,>=0.7.0 in /usr/local/lib/python3.10/dist-packages (from weasel<0.5.0,>=0.1.0->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (0.18.1)
Requirement already satisfied: smart-open<8.0.0,>=5.2.1 in /usr/local/lib/python3.10/dist-packages (from weasel<0.5.0,>=0.1.0->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (7.0.4)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (2.1.5)
Requirement already satisfied: marisa-trie>=0.7.7 in /usr/local/lib/python3.10/dist-packages (from language-data>=1.2->langcodes<4.0.0,>=3.2.0->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (1.2.0)
Requirement already satisfied: markdown-it-py>=2.2.0 in /usr/local/lib/python3.10/dist-packages (from rich>=10.11.0->typer<1.0.0,>=0.3.0->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (3.0.0)
Requirement already satisfied: pygments<3.0.0,>=2.13.0 in /usr/local/lib/python3.10/dist-packages (from rich>=10.11.0->typer<1.0.0,>=0.3.0->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (2.16.1)
Requirement already satisfied: wrapt in /usr/local/lib/python3.10/dist-packages (from smart-open<8.0.0,>=5.2.1->weasel<0.5.0,>=0.1.0->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (1.14.1)
Requirement already satisfied: mdurl~=0.1 in /usr/local/lib/python3.10/dist-packages (from markdown-it-py>=2.2.0->rich>=10.11.0->typer<1.0.0,>=0.3.0->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (0.1.2)
✔ Download and installation successful
You can now load the package via spacy.load('en_core_web_sm')
⚠ Restart to reload dependencies
If you are in a Jupyter or Colab notebook, you may need to restart Python in
order to load all the package's dependencies. You can do this by selecting the
'Restart kernel' or 'Restart runtime' option.
In [ ]:
import nltk
from nltk.corpus import wordnet
from nltk.tokenize import word_tokenize
from nltk.stem import WordNetLemmatizer

# Download necessary NLTK data
nltk.download('punkt')
nltk.download('wordnet')
nltk.download('omw-1.4')
nltk.download('averaged_perceptron_tagger') # Download the missing tagger model


# Function to convert nltk POS tags to wordnet POS tags
def get_wordnet_pos(word):
    tag = nltk.pos_tag([word])[0][1][0].upper()
    tag_dict = {"J": wordnet.ADJ,
                "N": wordnet.NOUN,
                "V": wordnet.VERB,
                "R": wordnet.ADV}
    return tag_dict.get(tag, wordnet.NOUN)

# Lemmatize function
def lemmatize_text(text):
    lemmatizer = WordNetLemmatizer()
    words = word_tokenize(text)
    lemmatized_words = [lemmatizer.lemmatize(word, get_wordnet_pos(word)) for word in words]
    return ' '.join(lemmatized_words)
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data] Downloading package omw-1.4 to /root/nltk_data...
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /root/nltk_data...
[nltk_data]   Unzipping taggers/averaged_perceptron_tagger.zip.
In [ ]:
balanced_x['review_text'] = balanced_x['review_text'].map(lambda x: lemmatize_text(x))
In [ ]:
import numpy as np
import pandas as pd
from sklearn.metrics import confusion_matrix, classification_report
import seaborn as sns
import matplotlib.pyplot as plt
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dense
from gensim.models import Word2Vec
import nltk
from nltk.corpus import wordnet
from nltk.tokenize import word_tokenize
from nltk.stem import WordNetLemmatizer
import spacy
In [ ]:
# Assuming you have lemmatized_texts from previous step
# Tokenize and pad sequences as before
tokenizer = Tokenizer()
tokenizer.fit_on_texts(balanced_x['review_text'])
sequences = tokenizer.texts_to_sequences(balanced_x['review_text'])
word_index = tokenizer.word_index

# Pad sequences
max_length = max(len(seq) for seq in sequences)
data = pad_sequences(sequences, maxlen=max_length)

# Split data into training and test sets
X_train_temp, X_test, y_train_temp, y_test = train_test_split(data, balanced_x['label'], test_size=0.1, random_state=42, stratify=balanced_x['label'])

# Further split training data into training and validation sets
X_train, X_val, y_train, y_val = train_test_split(X_train_temp, y_train_temp, test_size=0.1, random_state=42, stratify=y_train_temp)
In [ ]:
# As before, using lemmatized sentences
sentences = [text.split() for text in balanced_x['review_text']]
word2vec_model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, workers=4)

# Create embedding matrix
embedding_dim = 100  # This should match the vector_size of Word2Vec model
embedding_matrix = np.zeros((len(word_index) + 1, embedding_dim))

for word, i in word_index.items():
    if word in word2vec_model.wv:
        embedding_matrix[i] = word2vec_model.wv[word]
In [ ]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dense
from tensorflow.keras.callbacks import EarlyStopping

model = Sequential()
model.add(Embedding(input_dim=len(word_index) + 1,
                    output_dim=embedding_dim,
                    weights=[embedding_matrix],
                    input_length=max_length,
                    trainable=False))
model.add(Bidirectional(LSTM(64)))
model.add(Dense(1, activation='sigmoid'))

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Early stopping callback
early_stopping = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)


# Train the model
history = model.fit(X_train, y_train, epochs=100, batch_size=32, validation_data=(X_val, y_val), callbacks=[early_stopping])
Epoch 1/100
507/507 [==============================] - 53s 97ms/step - loss: 0.6934 - accuracy: 0.5193 - val_loss: 0.6898 - val_accuracy: 0.5278
Epoch 2/100
507/507 [==============================] - 49s 96ms/step - loss: 0.6642 - accuracy: 0.6007 - val_loss: 0.6567 - val_accuracy: 0.6300
Epoch 3/100
507/507 [==============================] - 49s 96ms/step - loss: 0.6240 - accuracy: 0.6570 - val_loss: 0.6363 - val_accuracy: 0.6372
Epoch 4/100
507/507 [==============================] - 49s 96ms/step - loss: 0.5978 - accuracy: 0.6834 - val_loss: 0.6281 - val_accuracy: 0.6378
Epoch 5/100
507/507 [==============================] - 49s 96ms/step - loss: 0.5755 - accuracy: 0.7009 - val_loss: 0.6268 - val_accuracy: 0.6544
Epoch 6/100
507/507 [==============================] - 49s 96ms/step - loss: 0.5507 - accuracy: 0.7194 - val_loss: 0.6283 - val_accuracy: 0.6528
Epoch 7/100
507/507 [==============================] - 49s 97ms/step - loss: 0.5245 - accuracy: 0.7417 - val_loss: 0.6355 - val_accuracy: 0.6567
Epoch 8/100
507/507 [==============================] - 49s 96ms/step - loss: 0.4886 - accuracy: 0.7656 - val_loss: 0.6512 - val_accuracy: 0.6439
In [ ]:
plt.figure(figsize=(10, 5))
plt.plot(range(1, 8 + 1), history.history['loss'], label='Train Loss')
plt.plot(range(1, 8 + 1), history.history['val_loss'], label='Val Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training and Validation Losses')
plt.legend()
plt.grid(True)
plt.show()
No description has been provided for this image

Evaluating BiLSTM and confusion matrix

In [ ]:
# Make predictions
y_pred_probs = model.predict(X_test)
y_pred = (y_pred_probs > 0.5).astype("int32")

# Generate confusion matrix
cm = confusion_matrix(y_test, y_pred)

# Visualize confusion matrix
plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', xticklabels=['Negative', 'Positive'], yticklabels=['Negative', 'Positive'])
plt.xlabel('Predicted')
plt.ylabel('True')
plt.title('Confusion Matrix')
plt.show()

# Print classification report
print(classification_report(y_test, y_pred, target_names=['Negative', 'Positive']))
63/63 [==============================] - 3s 43ms/step
No description has been provided for this image
              precision    recall  f1-score   support

    Negative       0.66      0.60      0.63      1000
    Positive       0.64      0.69      0.66      1000

    accuracy                           0.65      2000
   macro avg       0.65      0.65      0.65      2000
weighted avg       0.65      0.65      0.65      2000

Calibration plot of BiLSTM

In [ ]:
# Platt Scaling
from sklearn.linear_model import LogisticRegression
from sklearn.calibration import CalibratedClassifierCV
import numpy as np

# Assuming `model` is your trained BiLSTM model and `X_val` is your validation data
# Get predicted probabilities from BiLSTM
probs = model.predict(X_val)

# Flatten the probabilities (assuming binary classification)
probs_flat = np.reshape(probs, (-1,))

# Fit a logistic regression model for calibration
calibrator = CalibratedClassifierCV(LogisticRegression(), method='sigmoid')
calibrator.fit(probs_flat.reshape(-1, 1), y_val)

# Calibrate probabilities
calibrated_probs = calibrator.predict_proba(probs_flat.reshape(-1, 1))[:, 1]
57/57 [==============================] - 3s 43ms/step
In [ ]:
probs = np.hstack((1 - calibrated_probs.reshape(-1, 1), calibrated_probs.reshape(-1, 1)))
In [ ]:
import matplotlib.pyplot as plt
from sklearn.calibration import calibration_curve

# Plot calibration curve
prob_true, prob_pred = calibration_curve(y_val, probs[:, 1], n_bins=10)

plt.figure(figsize=(8, 8))
plt.plot(prob_pred, prob_true, marker='o', linestyle='--', color='b', label='BiLSTM')
plt.plot([0, 1], [0, 1], linestyle='--', color='k', label='Perfectly calibrated')
plt.xlabel('Predicted probability')
plt.ylabel('True probability')
plt.title('Calibration plot (Reliability Curve)')
plt.legend()
plt.show()
No description has been provided for this image

The calibration plot follows the 45 degree line closely showing that the model is reliable.

Transformer based model

Here, we use pre-trained transformer based models. We sample an equal number of positive and negative examples from th e full dataset for the experiments here, as utilizing the full dataset was not practical , given the resouces available.

In [ ]:
import numpy as np
from sklearn.model_selection import train_test_split

X = movie_reviews[['review_text', 'plot_summary']]
X['label'] = movie_reviews['is_spoiler'].astype(int)

# Upsampling negative samples
positive_x = X[X['label'] == 1]
negative_x = X[X['label'] == 0]

# Upsample negative samples
positive_x = positive_x.sample(n=5000, replace=False, random_state=42)
negative_x = negative_x.sample(n=5000, replace=False, random_state=42)


# Combine positive samples and upsampled negative samples
balanced_x = pd.concat([negative_x, positive_x])

# Shuffle the combined dataframe
balanced_x = balanced_x.sample(frac=1, random_state=42).reset_index(drop=True)
<ipython-input-14-f64f9dfaa09f>:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  X['label'] = movie_reviews['is_spoiler'].astype(int)

Using pre-trained bert-base-uncased BertForSequenceClassification model

In [ ]:
max_length = 512
In [ ]:
!pip install transformers torch peft
Requirement already satisfied: transformers in /usr/local/lib/python3.10/dist-packages (4.42.4)
Requirement already satisfied: torch in /usr/local/lib/python3.10/dist-packages (2.3.1+cu121)
Requirement already satisfied: peft in /usr/local/lib/python3.10/dist-packages (0.11.1)
Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from transformers) (3.15.4)
Requirement already satisfied: huggingface-hub<1.0,>=0.23.2 in /usr/local/lib/python3.10/dist-packages (from transformers) (0.23.5)
Requirement already satisfied: numpy<2.0,>=1.17 in /usr/local/lib/python3.10/dist-packages (from transformers) (1.25.2)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from transformers) (24.1)
Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.10/dist-packages (from transformers) (6.0.1)
Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.10/dist-packages (from transformers) (2024.5.15)
Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from transformers) (2.32.3)
Requirement already satisfied: safetensors>=0.4.1 in /usr/local/lib/python3.10/dist-packages (from transformers) (0.4.3)
Requirement already satisfied: tokenizers<0.20,>=0.19 in /usr/local/lib/python3.10/dist-packages (from transformers) (0.19.1)
Requirement already satisfied: tqdm>=4.27 in /usr/local/lib/python3.10/dist-packages (from transformers) (4.66.4)
Requirement already satisfied: typing-extensions>=4.8.0 in /usr/local/lib/python3.10/dist-packages (from torch) (4.12.2)
Requirement already satisfied: sympy in /usr/local/lib/python3.10/dist-packages (from torch) (1.13.0)
Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from torch) (3.3)
Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch) (3.1.4)
Requirement already satisfied: fsspec in /usr/local/lib/python3.10/dist-packages (from torch) (2023.6.0)
Requirement already satisfied: nvidia-cuda-nvrtc-cu12==12.1.105 in /usr/local/lib/python3.10/dist-packages (from torch) (12.1.105)
Requirement already satisfied: nvidia-cuda-runtime-cu12==12.1.105 in /usr/local/lib/python3.10/dist-packages (from torch) (12.1.105)
Requirement already satisfied: nvidia-cuda-cupti-cu12==12.1.105 in /usr/local/lib/python3.10/dist-packages (from torch) (12.1.105)
Requirement already satisfied: nvidia-cudnn-cu12==8.9.2.26 in /usr/local/lib/python3.10/dist-packages (from torch) (8.9.2.26)
Requirement already satisfied: nvidia-cublas-cu12==12.1.3.1 in /usr/local/lib/python3.10/dist-packages (from torch) (12.1.3.1)
Requirement already satisfied: nvidia-cufft-cu12==11.0.2.54 in /usr/local/lib/python3.10/dist-packages (from torch) (11.0.2.54)
Requirement already satisfied: nvidia-curand-cu12==10.3.2.106 in /usr/local/lib/python3.10/dist-packages (from torch) (10.3.2.106)
Requirement already satisfied: nvidia-cusolver-cu12==11.4.5.107 in /usr/local/lib/python3.10/dist-packages (from torch) (11.4.5.107)
Requirement already satisfied: nvidia-cusparse-cu12==12.1.0.106 in /usr/local/lib/python3.10/dist-packages (from torch) (12.1.0.106)
Requirement already satisfied: nvidia-nccl-cu12==2.20.5 in /usr/local/lib/python3.10/dist-packages (from torch) (2.20.5)
Requirement already satisfied: nvidia-nvtx-cu12==12.1.105 in /usr/local/lib/python3.10/dist-packages (from torch) (12.1.105)
Requirement already satisfied: triton==2.3.1 in /usr/local/lib/python3.10/dist-packages (from torch) (2.3.1)
Requirement already satisfied: nvidia-nvjitlink-cu12 in /usr/local/lib/python3.10/dist-packages (from nvidia-cusolver-cu12==11.4.5.107->torch) (12.5.82)
Requirement already satisfied: psutil in /usr/local/lib/python3.10/dist-packages (from peft) (5.9.5)
Requirement already satisfied: accelerate>=0.21.0 in /usr/local/lib/python3.10/dist-packages (from peft) (0.32.1)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->torch) (2.1.5)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests->transformers) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests->transformers) (3.7)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests->transformers) (2.0.7)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests->transformers) (2024.7.4)
Requirement already satisfied: mpmath<1.4,>=1.1.0 in /usr/local/lib/python3.10/dist-packages (from sympy->torch) (1.3.0)
In [ ]:
import json
from transformers import AutoTokenizer, AutoModelForSequenceClassification, AutoConfig

# Modify the configuration to set max_position_embeddings if needed
config = AutoConfig.from_pretrained('bert-base-uncased',
                                    num_labels=2,
                                    max_position_embeddings=max_length
                                    )

# Save the modified configuration to a JSON file
config_dir = prefix_path + '/config'
os.makedirs(config_dir, exist_ok=True)
config_path = os.path.join(config_dir, 'config.json')

with open(config_path, 'w') as f:
    json.dump(config.to_dict(), f)

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased', config=config)

# Load pre-trained model with custom configuration
model =AutoModelForSequenceClassification.from_pretrained(
    'bert-base-uncased',
    config=config_path,  # Use the path to the configuration file
    cache_dir= prefix_path + '/cache',  # Cache directory
    force_download=False,  # Force download the model weights
    resume_download=False,  # Resume download if it was previously interrupted
    proxies=None,  # Proxy servers
    output_loading_info=False,  # Do not return loading info
    local_files_only=False,  # Look for local files
    use_auth_token=False,  # Use authentication token if needed
    revision='main',  # Specific model version
    mirror=None,  # Mirror repository URL
    #gradient_checkpointing=True,
    ignore_mismatched_sizes=True
)

from peft import LoraConfig, get_peft_model

# Define LoRA configuration
lora_config = LoraConfig(
    r=8,  # The rank of the LoRA matrix
    lora_alpha=32,  # Scaling factor
    target_modules=["query", "value"],  # Target modules for LoRA
    lora_dropout=0.1,  # Dropout rate
    bias="none"  # Bias handling
)

# Apply LoRA to the model
model = get_peft_model(model, lora_config)
#model = BertForSequenceClassification.from_pretrained('bert-base-uncased', config=config, ignore_mismatched_sizes=True)

# Enable gradient checkpointing
#model.gradient_checkpointing_enable()
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_token.py:89: UserWarning: 
The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py:468: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
  warnings.warn(
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
In [ ]:
from sklearn.model_selection import train_test_split

# Split data into train, validation, and test sets
train_df, temp_df = train_test_split(balanced_x, test_size=0.2, random_state=42, stratify=balanced_x['label'])
eval_df, test_df = train_test_split(temp_df, test_size=0.5, random_state=42, stratify=temp_df['label'])
In [ ]:
import pandas as pd
import torch
from torch.utils.data import DataLoader, Dataset
from sklearn.model_selection import train_test_split
from transformers import RobertaTokenizer, RobertaForSequenceClassification, RobertaConfig, Trainer, TrainingArguments, default_data_collator
from sklearn.metrics import accuracy_score
In [ ]:
class SentencePairDataset(Dataset):
    def __init__(self, dataframe, tokenizer, max_length):
        self.dataframe = dataframe
        self.tokenizer = tokenizer
        self.max_length = max_length

    def __len__(self):
        return len(self.dataframe)

    def __getitem__(self, idx):
        sentence1 = self.dataframe.iloc[idx, 0]
        sentence2 = self.dataframe.iloc[idx, 1]
        label = self.dataframe.iloc[idx, 2]
        inputs = self.tokenizer.encode_plus(
            sentence1,
            sentence2,
            add_special_tokens=True,
            max_length=self.max_length,
            truncation=True,
            padding="max_length",
            return_tensors="pt",
            return_attention_mask=True,
            return_token_type_ids=True,
            return_overflowing_tokens=False,
            return_length=True,
            pad_to_max_length=True
        )
        inputs["labels"] = torch.tensor(label, dtype=torch.long)
        return {key: val.squeeze(0) for key, val in inputs.items()}
In [ ]:
# Create dataset objects
train_dataset = SentencePairDataset(train_df, tokenizer, max_length)
eval_dataset = SentencePairDataset(eval_df, tokenizer, max_length)
test_dataset = SentencePairDataset(test_df, tokenizer, max_length)
In [ ]:
train_dataset.__getitem__(0)
Out[ ]:
{'input_ids': tensor([  101,  2748,  2023,  2003,  1037, 10225, 25318,  3185, 10303,  2045,
          2003,  2070, 11749,  5171, 10225, 25318, 13764,  8649,  1998,  1037,
          8552,  1998, 13749, 18595,  3560,  2466,  1012,  2021,  2242,  2003,
          4394,  2182,  1010,  2926,  4102,  2000, 10225, 25318,  1005,  1055,
          2010,  2060,  2147,  1012,  2364,  3291,  2003,  1045,  2228,  1996,
          3494,  1012,  2027,  2024,  2204,  2035,  1998, 18080,  1999,  1037,
          2126,  2021,  2027,  2024,  2025,  2428, 13432,  1012,  2926,  1996,
          2364,  2839,  9901,  2829, 14089, 24665,  3771,  2003, 15640,  1012,
          2016,  2003,  2025,  2428,  2844,  2438,  2004,  1037,  2839,  2000,
          4287,  2023,  2972,  3185,  1012,  2613,  2364,  2839,  1997,  1996,
          3185,  2000,  2033,  2001,  5212,  1048,  1012,  4027,  2021,  2339,
          2821,  2339,  2106,  2017,  2018,  2000,  4929,  2008,  5976,  2559,
         18865,  3520,  1029,  2839,  2008,  1045,  5621,  4669,  2001,  2728,
          2139,  9152,  3217,  1005,  1055,  1012,  2002,  3248,  1037,  4066,
          1997, 10916,  2535,  2066,  2017,  2052,  2025,  5987,  2013,  2032,
          1999,  1037,  3185,  2066,  2023,  1012,  1996,  2717,  1997,  1996,
          2839,  2024,  2025,  2428,  2844,  2030,  5875,  2438,  2000,  4287,
          2023,  2972,  3185,  1012,  2428, 10473,  5783,  1997,  1996,  3185,
          2003,  1996,  8235,  2135,  2517,  5896,  2008,  2320,  2153,  2003,
          3561,  2007,  2070,  5171, 10225, 25318,  3787,  1998, 21438,  1012,
          2748,  1010,  1996,  4566,  2453,  2018,  2042,  1037,  2978,  5410,
          1998,  3383, 15640,  2021,  2053,  2009,  2106,  2025,  9868,  1996,
          3185,  2005,  2033,  1012,  1045,  2145,  5632,  2023,  3185,  1010,
          2009,  2074,  2347,  1005,  1056,  2004,  2204,  2004,  1045,  2018,
          5113,  2005,  1012,  2009,  2001,  2785,  1997,  2066,  1996,  3185,
          2041,  1997,  4356,  2011,  7112,  2061,  4063,  4059,  2232,  2029,
          2038,  1037,  2785,  1997,  2714,  3857, 22264, 27983,  1998,  2466,
          1010,  2009,  3310,  2200,  2485,  2012,  2335,  2021,  2009,  2196,
          3243,  3138,  1996,  9850,  1012,  2145,  1037,  6669,  3422,  3085,
          1010,  3262,  4569,  3185,  2021,  2145,  2017,  2024,  2488,  1997,
          3666, 10225, 25318,  1005,  1055,  2060,  2147,  1012, 27671,   102,
          1996,  2690, 18655, 17946,  7971,  9901,  2829, 20673, 17125,  2769,
          2013,  3290,  2000,  3050,  3349,  2005,  1996,  2608, 11033,  2030,
         12662, 12289,  1012,  2043,  2016,  4152,  3236,  2011,  1996,  6074,
          4097,  9851,  2102,  1998,  2928, 18243, 12349,  2007,  2702,  4595,
          6363,  1998, 16034,  1999,  2014,  8722,  1010,  2027, 16599,  1037,
          3066,  2000,  2014,  2000,  2393,  2068,  2000,  6545,  2030, 12662,
          1999,  3863,  1997,  2014,  4071,  1012,  5564,  2030, 12662,  5176,
          1996,  5179, 29100, 11614,  4098,  9115,  1010,  2040,  3216,  1037,
         15358,  5416,  2449,  1010,  2000,  2713,  9901,  2829,  2007,  1996,
          6808,  1997, 15349,  2014,  1012,  9901, 13172,  1997,  2030, 12662,
          1005,  1055,  6808,  1998, 14811,  1037,  8552,  7023,  2208,  2007,
          4098,  2000,  8954,  2431,  1037,  2454,  6363,  2013,  2030, 12662,
          1012,  2517,  2011, 19569,  2482, 10175,  6806,  1010,  5673,  2139,
         11497,  1010,  4380,   102,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0]),
 'token_type_ids': tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0]),
 'attention_mask': tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0]),
 'length': tensor(512),
 'labels': tensor(0)}
In [ ]:
# DataLoader
train_loader = DataLoader(train_dataset, batch_size=1, shuffle=True, collate_fn=default_data_collator)
val_loader = DataLoader(eval_dataset, batch_size=1, collate_fn=default_data_collator)
test_loader = DataLoader(test_dataset, batch_size=1, collate_fn=default_data_collator)
In [ ]:
import torch

# Check if CUDA is available
if torch.cuda.is_available():
    device_id = torch.cuda.current_device()  # Get the current device ID
    torch.cuda.empty_cache()
    torch.cuda.reset_peak_memory_stats(device_id)  # Pass the device ID
    print(torch.cuda.memory_summary(device=None, abbreviated=False))
else:
    print("CUDA is not available. Please check your installation.")
|===========================================================================|
|                  PyTorch CUDA memory summary, device ID 0                 |
|---------------------------------------------------------------------------|
|            CUDA OOMs: 0            |        cudaMalloc retries: 0         |
|===========================================================================|
|        Metric         | Cur Usage  | Peak Usage | Tot Alloc  | Tot Freed  |
|---------------------------------------------------------------------------|
| Allocated memory      |      0 B   |      0 B   |      0 B   |      0 B   |
|       from large pool |      0 B   |      0 B   |      0 B   |      0 B   |
|       from small pool |      0 B   |      0 B   |      0 B   |      0 B   |
|---------------------------------------------------------------------------|
| Active memory         |      0 B   |      0 B   |      0 B   |      0 B   |
|       from large pool |      0 B   |      0 B   |      0 B   |      0 B   |
|       from small pool |      0 B   |      0 B   |      0 B   |      0 B   |
|---------------------------------------------------------------------------|
| Requested memory      |      0 B   |      0 B   |      0 B   |      0 B   |
|       from large pool |      0 B   |      0 B   |      0 B   |      0 B   |
|       from small pool |      0 B   |      0 B   |      0 B   |      0 B   |
|---------------------------------------------------------------------------|
| GPU reserved memory   |      0 B   |      0 B   |      0 B   |      0 B   |
|       from large pool |      0 B   |      0 B   |      0 B   |      0 B   |
|       from small pool |      0 B   |      0 B   |      0 B   |      0 B   |
|---------------------------------------------------------------------------|
| Non-releasable memory |      0 B   |      0 B   |      0 B   |      0 B   |
|       from large pool |      0 B   |      0 B   |      0 B   |      0 B   |
|       from small pool |      0 B   |      0 B   |      0 B   |      0 B   |
|---------------------------------------------------------------------------|
| Allocations           |       0    |       0    |       0    |       0    |
|       from large pool |       0    |       0    |       0    |       0    |
|       from small pool |       0    |       0    |       0    |       0    |
|---------------------------------------------------------------------------|
| Active allocs         |       0    |       0    |       0    |       0    |
|       from large pool |       0    |       0    |       0    |       0    |
|       from small pool |       0    |       0    |       0    |       0    |
|---------------------------------------------------------------------------|
| GPU reserved segments |       0    |       0    |       0    |       0    |
|       from large pool |       0    |       0    |       0    |       0    |
|       from small pool |       0    |       0    |       0    |       0    |
|---------------------------------------------------------------------------|
| Non-releasable allocs |       0    |       0    |       0    |       0    |
|       from large pool |       0    |       0    |       0    |       0    |
|       from small pool |       0    |       0    |       0    |       0    |
|---------------------------------------------------------------------------|
| Oversize allocations  |       0    |       0    |       0    |       0    |
|---------------------------------------------------------------------------|
| Oversize GPU segments |       0    |       0    |       0    |       0    |
|===========================================================================|

In [ ]:
from transformers import EarlyStoppingCallback
# Early stopping callback
early_stopping = EarlyStoppingCallback(early_stopping_patience=2, early_stopping_threshold=0.0,)
In [ ]:
import numpy as np
from datasets import load_metric

def compute_metrics(p):
    accuracy_metric = load_metric("accuracy")
    predictions = np.argmax(p.predictions, axis=1)
    accuracy = accuracy_metric.compute(predictions=predictions, references=p.label_ids)
    return accuracy
In [ ]:
# Training arguments
training_args = TrainingArguments(
    output_dir= prefix_path + '/results',
    num_train_epochs=10,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir= prefix_path + '/logs',
    logging_steps=10,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model='eval_loss',
    greater_is_better=False,
    gradient_accumulation_steps=12,
    fp16=True,
    label_names=["labels"]
)
/usr/local/lib/python3.10/dist-packages/transformers/training_args.py:1494: FutureWarning: `evaluation_strategy` is deprecated and will be removed in version 4.46 of 🤗 Transformers. Use `eval_strategy` instead
  warnings.warn(
In [ ]:
# Custom Trainer class
class CustomTrainer(Trainer):
    def __init__(
        self, # Add self
        model,
        args, # Change model to args
        train_dataset=None,
        eval_dataset=None,
        data_collator=None,
        compute_metrics=None,
        tokenizer=None, # Pass the tokenizer here
        callbacks=None
    ):
        super().__init__(
            model=model,
            args=training_args,
            train_dataset=train_dataset,
            eval_dataset=eval_dataset,
            data_collator=default_data_collator,
            compute_metrics=compute_metrics,
            tokenizer=tokenizer, # Pass the tokenizer here
            callbacks=[early_stopping]
        )

    # Check your training data
    #print(train_dataset[0])  # Print the first example in your training dataset
    #print(train_dataset.column_names) # Print the column names of your training dataset

    def save_model(self, output_dir=None, _internal_call=False):
        # Only save the model and tokenizer, not the optimizer state
        if output_dir is None:
            output_dir = self.args.output_dir

        self.model.save_pretrained(output_dir)
        # Convert BertConfig to a JSON-serializable dictionary
        #tokenizer_config_dict = self.tokenizer.config.to_dict()
        #self.tokenizer.save_pretrained(output_dir, config=tokenizer_config_dict) # Pass the dictionary here
        self._rotate_checkpoints(use_mtime=True)
In [ ]:
# Trainer
trainer = CustomTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    data_collator=default_data_collator,
    compute_metrics=compute_metrics,
    tokenizer=tokenizer, # Pass the tokenizer here
    callbacks=[early_stopping]
)
In [ ]:
# Train the model
trainer.train()

# Save the model, tokenizer, and trainer state
model.save_pretrained(prefix_path + '/results')
#tokenizer.save_pretrained(prefix_path + '/results')
#trainer.save_model(prefix_path + '/results')  # Save the trained model
trainer.save_state()

# Save the model and tokenizer
model_save_path = prefix_path + '/results' + "/model"
tokenizer_save_path = prefix_path + '/results' + "./tokenizer"
trainer.save_model(model_save_path)
tokenizer.save_pretrained(tokenizer_save_path)

# Save the trainer state
trainer.state.save_to_json( prefix_path + '/results' + "/trainer_state.json")
[4473/6660 41:01 < 20:04, 1.82 it/s, Epoch 6.71/10]
Epoch Training Loss Validation Loss Accuracy
0 0.676600 0.645086 0.635000
1 0.569500 0.608502 0.661000
3 0.580300 0.580992 0.692000
4 0.561200 0.574466 0.698000
6 0.580600 0.571313 0.699000

/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
[4555/6660 41:44 < 19:17, 1.82 it/s, Epoch 6.83/10]
Epoch Training Loss Validation Loss Accuracy
0 0.676600 0.645086 0.635000
1 0.569500 0.608502 0.661000
3 0.580300 0.580992 0.692000
4 0.561200 0.574466 0.698000
6 0.580600 0.571313 0.699000

Evaluation on eval dataset and plot of training and validation loss over epochs

In [ ]:
import matplotlib.pyplot as plt

# Initialize lists to hold the extracted values
val_losses = []
training_losses = []
epochs = []

# Track the last training loss before each eval_loss entry
last_training_loss = None

for entry in trainer.state.log_history:
    # Extract training loss
    # Update the last seen training loss
    if 'loss' in entry:
        last_training_loss = entry['loss']

    # Extract eval_loss and corresponding epoch
    if 'eval_loss' in entry:
        val_losses.append(entry['eval_loss'])
        training_losses.append(last_training_loss)
        epochs.append(entry['epoch'])

# Plot training and validation loss
plt.plot(epochs, training_losses, label='Training Loss')
plt.plot(epochs, val_losses, label='Validation Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.title('Training and Validation Loss')
plt.legend()
plt.show()
No description has been provided for this image
In [ ]:
# Evaluate the model on evaluation dataset
eval_results = trainer.evaluate()
# Print results
print(eval_results)
[1000/1000 00:20]
{'eval_loss': 0.5713134407997131, 'eval_accuracy': 0.699, 'eval_runtime': 21.0249, 'eval_samples_per_second': 47.563, 'eval_steps_per_second': 47.563, 'epoch': 7.9995}

Evaluation of transformer based model using test data, confusion matrix and classification report

In [ ]:
y_pred_probs  = trainer.predict(test_dataset)
In [ ]:
y_pred_probs.predictions.shape
Out[ ]:
(1000, 2)
In [ ]:
# Extract predicted labels
preds = np.argmax(y_pred_probs.predictions, axis=1)

# Compute confusion matrix
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
import matplotlib.pyplot as plt

cm = confusion_matrix(test_df['label'], preds)

# Plot confusion matrix
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=[0, 1])
disp.plot(cmap=plt.cm.Blues)
plt.title('Confusion Matrix')
plt.show()


# Print evaluation results
print(f"Evaluation results: {y_pred_probs.metrics}")
No description has been provided for this image
Evaluation results: {'test_loss': 0.5969365835189819, 'test_accuracy': 0.681, 'test_runtime': 22.5106, 'test_samples_per_second': 44.424, 'test_steps_per_second': 44.424}
In [ ]:
from sklearn.metrics import classification_report # Import the missing function

# Print classification report
print(classification_report(test_df['label'], preds, target_names=['Negative', 'Positive']))
              precision    recall  f1-score   support

    Negative       0.67      0.72      0.69       500
    Positive       0.69      0.65      0.67       500

    accuracy                           0.68      1000
   macro avg       0.68      0.68      0.68      1000
weighted avg       0.68      0.68      0.68      1000

Calibration report over eval dataset

In [ ]:
# Platt Scaling
from sklearn.linear_model import LogisticRegression
from sklearn.calibration import CalibratedClassifierCV
import numpy as np

pred_eval = trainer.predict(eval_dataset)

preds = np.argmax(pred_eval.predictions, axis=1)

# Flatten the probabilities (assuming binary classification)
probs_flat = np.reshape(preds, (-1,))

# Fit a logistic regression model for calibration
calibrator = CalibratedClassifierCV(LogisticRegression(), method='sigmoid')
calibrator.fit(probs_flat.reshape(-1, 1), pred_eval.label_ids)

# Calibrate probabilities
calibrated_probs = calibrator.predict_proba(probs_flat.reshape(-1, 1))[:, 1]
In [ ]:
probs = np.hstack((1 - calibrated_probs.reshape(-1, 1), calibrated_probs.reshape(-1, 1)))
In [ ]:
import matplotlib.pyplot as plt
from sklearn.calibration import calibration_curve

# Plot calibration curve
prob_true, prob_pred = calibration_curve(pred_eval.label_ids, probs[:, 1], n_bins=10)

plt.figure(figsize=(8, 8))
plt.plot(prob_pred, prob_true, marker='o', linestyle='--', color='b', label='Transformer based model')
plt.plot([0, 1], [0, 1], linestyle='--', color='k', label='Perfectly calibrated')
plt.xlabel('Predicted probability')
plt.ylabel('True probability')
plt.title('Calibration plot (Reliability Curve)')
plt.legend()
plt.show()
No description has been provided for this image

Using pre-trained allenai/longformer-base-4096 LongformerForSequenceClassification model

In [ ]:
max_length = 4096
In [ ]:
import json
from transformers import AutoTokenizer, AutoModelForSequenceClassification, AutoConfig

# Modify the configuration to set max_position_embeddings if needed
config = AutoConfig.from_pretrained('allenai/longformer-base-4096',
                                    num_labels=2,
                                    max_position_embeddings=max_length
                                    )

# Save the modified configuration to a JSON file
config_dir = prefix_path + '/config'
os.makedirs(config_dir, exist_ok=True)
config_path = os.path.join(config_dir, 'config.json')

with open(config_path, 'w') as f:
    json.dump(config.to_dict(), f)

tokenizer = AutoTokenizer.from_pretrained('allenai/longformer-base-4096', config=config)

# Load pre-trained model with custom configuration
model =AutoModelForSequenceClassification.from_pretrained(
    'allenai/longformer-base-4096',
    config=config_path,  # Use the path to the configuration file
    cache_dir= prefix_path + '/cache',  # Cache directory
    force_download=False,  # Force download the model weights
    resume_download=False,  # Resume download if it was previously interrupted
    proxies=None,  # Proxy servers
    output_loading_info=False,  # Do not return loading info
    local_files_only=False,  # Look for local files
    use_auth_token=False,  # Use authentication token if needed
    revision='main',  # Specific model version
    mirror=None,  # Mirror repository URL
    #gradient_checkpointing=True,
    ignore_mismatched_sizes=True
)

from peft import LoraConfig, get_peft_model

# Define LoRA configuration
lora_config = LoraConfig(
    r=8,  # The rank of the LoRA matrix
    lora_alpha=32,  # Scaling factor
    target_modules=["query", "value"],  # Target modules for LoRA
    lora_dropout=0.1,  # Dropout rate
    bias="none"  # Bias handling
)

# Apply LoRA to the model
model = get_peft_model(model, lora_config)
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
config.json:   0%|          | 0.00/694 [00:00<?, ?B/s]
vocab.json:   0%|          | 0.00/899k [00:00<?, ?B/s]
merges.txt:   0%|          | 0.00/456k [00:00<?, ?B/s]
tokenizer.json:   0%|          | 0.00/1.36M [00:00<?, ?B/s]
/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py:468: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
  warnings.warn(
Some weights of LongformerForSequenceClassification were not initialized from the model checkpoint at allenai/longformer-base-4096 and are newly initialized: ['classifier.dense.bias', 'classifier.dense.weight', 'classifier.out_proj.bias', 'classifier.out_proj.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
In [ ]:
# Create dataset objects
train_dataset = SentencePairDataset(train_df, tokenizer, max_length)
eval_dataset = SentencePairDataset(eval_df, tokenizer, max_length)
test_dataset = SentencePairDataset(test_df, tokenizer, max_length)
In [ ]:
train_dataset.__getitem__(0)
Out[ ]:
{'input_ids': tensor([    0, 10932,    42,  ...,     1,     1,     1]),
 'token_type_ids': tensor([0, 0, 0,  ..., 0, 0, 0]),
 'attention_mask': tensor([1, 1, 1,  ..., 0, 0, 0]),
 'length': tensor(4096),
 'labels': tensor(0)}
In [ ]:
# DataLoader
train_loader = DataLoader(train_dataset, batch_size=1, shuffle=True, collate_fn=default_data_collator)
val_loader = DataLoader(eval_dataset, batch_size=1, collate_fn=default_data_collator)
test_loader = DataLoader(test_dataset, batch_size=1, collate_fn=default_data_collator)
In [ ]:
import torch

# Check if CUDA is available
if torch.cuda.is_available():
    device_id = torch.cuda.current_device()  # Get the current device ID
    torch.cuda.empty_cache()
    torch.cuda.reset_peak_memory_stats(device_id)  # Pass the device ID
    print(torch.cuda.memory_summary(device=None, abbreviated=False))
else:
    print("CUDA is not available. Please check your installation.")
|===========================================================================|
|                  PyTorch CUDA memory summary, device ID 0                 |
|---------------------------------------------------------------------------|
|            CUDA OOMs: 0            |        cudaMalloc retries: 0         |
|===========================================================================|
|        Metric         | Cur Usage  | Peak Usage | Tot Alloc  | Tot Freed  |
|---------------------------------------------------------------------------|
| Allocated memory      | 450034 KiB | 450034 KiB |  88270 GiB |  88269 GiB |
|       from large pool | 444928 KiB | 444928 KiB |  54891 GiB |  54891 GiB |
|       from small pool |   5106 KiB |   5106 KiB |  33378 GiB |  33378 GiB |
|---------------------------------------------------------------------------|
| Active memory         | 450034 KiB | 450034 KiB |  88270 GiB |  88269 GiB |
|       from large pool | 444928 KiB | 444928 KiB |  54891 GiB |  54891 GiB |
|       from small pool |   5106 KiB |   5106 KiB |  33378 GiB |  33378 GiB |
|---------------------------------------------------------------------------|
| Requested memory      | 448927 KiB | 448927 KiB |  81565 GiB |  81565 GiB |
|       from large pool | 443822 KiB | 443822 KiB |  48188 GiB |  48188 GiB |
|       from small pool |   5105 KiB |   5105 KiB |  33377 GiB |  33377 GiB |
|---------------------------------------------------------------------------|
| GPU reserved memory   | 528384 KiB | 528384 KiB | 254702 MiB | 254186 MiB |
|       from large pool | 501760 KiB | 501760 KiB |    930 MiB |    440 MiB |
|       from small pool |  26624 KiB |  26624 KiB | 253772 MiB | 253746 MiB |
|---------------------------------------------------------------------------|
| Non-releasable memory |  78349 KiB |  78349 KiB | 105710 GiB | 105710 GiB |
|       from large pool |  56832 KiB |  56832 KiB |  68984 GiB |  68984 GiB |
|       from small pool |  21517 KiB |  21517 KiB |  36725 GiB |  36725 GiB |
|---------------------------------------------------------------------------|
| Allocations           |     399    |     399    |  102178 K  |  102178 K  |
|       from large pool |      77    |      77    |   25952 K  |   25952 K  |
|       from small pool |     322    |     322    |   76226 K  |   76226 K  |
|---------------------------------------------------------------------------|
| Active allocs         |     399    |     399    |  102178 K  |  102178 K  |
|       from large pool |      77    |      77    |   25952 K  |   25952 K  |
|       from small pool |     322    |     322    |   76226 K  |   76226 K  |
|---------------------------------------------------------------------------|
| GPU reserved segments |      34    |      34    |  126929    |  126895    |
|       from large pool |      21    |      21    |      43    |      22    |
|       from small pool |      13    |      13    |  126886    |  126873    |
|---------------------------------------------------------------------------|
| Non-releasable allocs |      60    |      60    |   56747 K  |   56747 K  |
|       from large pool |      20    |      20    |   19373 K  |   19373 K  |
|       from small pool |      40    |      40    |   37374 K  |   37374 K  |
|---------------------------------------------------------------------------|
| Oversize allocations  |       0    |       0    |       0    |       0    |
|---------------------------------------------------------------------------|
| Oversize GPU segments |       0    |       0    |       0    |       0    |
|===========================================================================|

In [ ]:
# Trainer
trainer = CustomTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    data_collator=default_data_collator,
    compute_metrics=compute_metrics,
    tokenizer=tokenizer, # Pass the tokenizer here
    callbacks=[early_stopping]
)
In [ ]:
# Train the model
trainer.train()

# Save the model, tokenizer, and trainer state
model.save_pretrained(prefix_path + '/results')
#tokenizer.save_pretrained(prefix_path + '/results')
#trainer.save_model(prefix_path + '/results')  # Save the trained model
trainer.save_state()

# Save the model and tokenizer
model_save_path = prefix_path + '/results' + "/model"
tokenizer_save_path = prefix_path + '/results' + "./tokenizer"
trainer.save_model(model_save_path)
tokenizer.save_pretrained(tokenizer_save_path)

# Save the trainer state
trainer.state.save_to_json( prefix_path + '/results' + "/trainer_state.json")
Initializing global attention on CLS token...
[6574/6660 19:19:20 < 15:10, 0.09 it/s, Epoch 9.86/10]
Epoch Training Loss Validation Loss Accuracy
0 0.660300 0.646325 0.632000
1 0.558600 0.586706 0.697000
3 0.582600 0.574838 0.705000
4 0.514200 0.594929 0.689000
6 0.615000 0.571699 0.708000
7 0.509200 0.565953 0.718000
9 0.526000 0.560785 0.721000

/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
[6660/6660 19:37:52, Epoch 9/10]
Epoch Training Loss Validation Loss Accuracy
0 0.660300 0.646325 0.632000
1 0.558600 0.586706 0.697000
3 0.582600 0.574838 0.705000
4 0.514200 0.594929 0.689000
6 0.615000 0.571699 0.708000
7 0.509200 0.565953 0.718000
9 0.591700 0.564030 0.721000

/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(

Evaluation of eval dataset and plot of training and validation loss

In [ ]:
import matplotlib.pyplot as plt

# Initialize lists to hold the extracted values
val_losses = []
training_losses = []
epochs = []

# Track the last training loss before each eval_loss entry
last_training_loss = None

for entry in trainer.state.log_history:
    # Extract training loss
    # Update the last seen training loss
    if 'loss' in entry:
        last_training_loss = entry['loss']

    # Extract eval_loss and corresponding epoch
    if 'eval_loss' in entry:
        val_losses.append(entry['eval_loss'])
        training_losses.append(last_training_loss)
        epochs.append(entry['epoch'])

# Plot training and validation loss
plt.plot(epochs, training_losses, label='Training Loss')
plt.plot(epochs, val_losses, label='Validation Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.title('Training and Validation Loss')
plt.legend()
plt.show()
No description has been provided for this image
In [ ]:
# Evaluate the model on evaluation dataset
eval_results = trainer.evaluate()
# Print results
print(eval_results)
[1000/1000 03:36]
{'eval_loss': 0.5607852935791016, 'eval_accuracy': 0.721, 'eval_runtime': 217.9064, 'eval_samples_per_second': 4.589, 'eval_steps_per_second': 4.589, 'epoch': 9.99}

Evaluation of longformer based model on test data, confusion matrix and classification report

In [ ]:
y_pred_probs  = trainer.predict(test_dataset)

# Extract predicted labels
preds = np.argmax(y_pred_probs.predictions, axis=1)

# Compute confusion matrix
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
import matplotlib.pyplot as plt

cm = confusion_matrix(test_df['label'], preds)

# Plot confusion matrix
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=[0, 1])
disp.plot(cmap=plt.cm.Blues)
plt.title('Confusion Matrix')
plt.show()


# Print evaluation results
print(f"Evaluation results: {y_pred_probs.metrics}")
No description has been provided for this image
Evaluation results: {'test_loss': 0.5858533978462219, 'test_accuracy': 0.715, 'test_runtime': 217.8881, 'test_samples_per_second': 4.59, 'test_steps_per_second': 4.59}
In [ ]:
from sklearn.metrics import classification_report # Import the missing function

# Print classification report
print(classification_report(test_df['label'], preds, target_names=['Negative', 'Positive']))
              precision    recall  f1-score   support

    Negative       0.70      0.76      0.73       500
    Positive       0.74      0.67      0.70       500

    accuracy                           0.71      1000
   macro avg       0.72      0.72      0.71      1000
weighted avg       0.72      0.71      0.71      1000

Calibration report over eval_dataset

In [ ]:
# Platt Scaling
from sklearn.linear_model import LogisticRegression
from sklearn.calibration import CalibratedClassifierCV
import numpy as np

pred_eval = trainer.predict(eval_dataset)

preds = np.argmax(pred_eval.predictions, axis=1)

# Flatten the probabilities (assuming binary classification)
probs_flat = np.reshape(preds, (-1,))

# Fit a logistic regression model for calibration
calibrator = CalibratedClassifierCV(LogisticRegression(), method='sigmoid')
calibrator.fit(probs_flat.reshape(-1, 1), eval_df['label'])

# Calibrate probabilities
calibrated_probs = calibrator.predict_proba(probs_flat.reshape(-1, 1))[:, 1]
In [ ]:
probs = np.hstack((1 - calibrated_probs.reshape(-1, 1), calibrated_probs.reshape(-1, 1)))
In [ ]:
import matplotlib.pyplot as plt
from sklearn.calibration import calibration_curve

# Plot calibration curve
prob_true, prob_pred = calibration_curve(eval_df['label'], probs[:, 1], n_bins=10)

plt.figure(figsize=(8, 8))
plt.plot(prob_pred, prob_true, marker='o', linestyle='--', color='b', label='Transformer based model')
plt.plot([0, 1], [0, 1], linestyle='--', color='k', label='Perfectly calibrated')
plt.xlabel('Predicted probability')
plt.ylabel('True probability')
plt.title('Calibration plot (Reliability Curve)')
plt.legend()
plt.show()
No description has been provided for this image

Calibration by temperature scaling
In [ ]:
 
In [ ]:
import numpy as np

# Get predictions on the validation set
predictions = trainer.predict(encoded_dataset['validation'])
logits = predictions.predictions
labels = predictions.label_ids

# Convert logits to probabilities
probs = torch.nn.functional.softmax(torch.tensor(logits), dim=-1).numpy()
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-3-e1a1d7068d18> in <cell line: 4>()
      2 
      3 # Get predictions on the validation set
----> 4 predictions = trainer.predict(encoded_dataset['validation'])
      5 logits = predictions.predictions
      6 labels = predictions.label_ids

NameError: name 'trainer' is not defined

Two tower (Ensemble for prediuction) BERT and BERT

In [ ]:
import numpy as np
from sklearn.model_selection import train_test_split

X = movie_reviews[['review_text', 'plot_synopsis']]
X['label'] = movie_reviews['is_spoiler'].astype(int)

# Upsampling negative samples
positive_x = X[X['label'] == 1]
negative_x = X[X['label'] == 0]

# Upsample negative samples
positive_x = positive_x.sample(n=5000, replace=False, random_state=42)
negative_x = negative_x.sample(n=5000, replace=False, random_state=42)


# Combine positive samples and upsampled negative samples
balanced_x = pd.concat([negative_x, positive_x])

# Shuffle the combined dataframe
balanced_x = balanced_x.sample(frac=1, random_state=42).reset_index(drop=True)
<ipython-input-56-9402b26fe34c>:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  X['label'] = movie_reviews['is_spoiler'].astype(int)
In [ ]:
from sklearn.model_selection import train_test_split

# Split data into train, validation, and test sets
train_df, temp_df = train_test_split(balanced_x, test_size=0.2, random_state=42, stratify=balanced_x['label'])
eval_df, test_df = train_test_split(temp_df, test_size=0.5, random_state=42, stratify=temp_df['label'])
In [ ]:
 
In [ ]:
import json
from transformers import AutoTokenizer, AutoModelForSequenceClassification, AutoConfig

# Modify the configuration to set max_position_embeddings if needed
config = AutoConfig.from_pretrained('bert-base-uncased',
                                    num_labels=2,
                                    max_position_embeddings=max_length
                                    )

# Save the modified configuration to a JSON file
config_dir = prefix_path + '/config'
os.makedirs(config_dir, exist_ok=True)
config_path = os.path.join(config_dir, 'config.json')

with open(config_path, 'w') as f:
    json.dump(config.to_dict(), f)

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased', config=config)

# Load pre-trained model with custom configuration
model_premise =AutoModelForSequenceClassification.from_pretrained(
    'bert-base-uncased',
    config=config_path,  # Use the path to the configuration file
    cache_dir= prefix_path + '/premise_cache',  # Cache directory
    force_download=False,  # Force download the model weights
    resume_download=False,  # Resume download if it was previously interrupted
    proxies=None,  # Proxy servers
    output_loading_info=False,  # Do not return loading info
    local_files_only=False,  # Look for local files
    use_auth_token=False,  # Use authentication token if needed
    revision='main',  # Specific model version
    mirror=None,  # Mirror repository URL
    #gradient_checkpointing=True,
    ignore_mismatched_sizes=True
)

from peft import LoraConfig, get_peft_model

# Define LoRA configuration
lora_config = LoraConfig(
    r=8,  # The rank of the LoRA matrix
    lora_alpha=32,  # Scaling factor
    target_modules=["query", "value"],  # Target modules for LoRA
    lora_dropout=0.1,  # Dropout rate
    bias="none"  # Bias handling
)

# Apply LoRA to the model
model_premise = get_peft_model(model_premise, lora_config)


model_hypothesis =AutoModelForSequenceClassification.from_pretrained(
    'bert-base-uncased',
    config=config_path,  # Use the path to the configuration file
    cache_dir= prefix_path + '/hypothesis-cache',  # Cache directory
    force_download=False,  # Force download the model weights
    resume_download=False,  # Resume download if it was previously interrupted
    proxies=None,  # Proxy servers
    output_loading_info=False,  # Do not return loading info
    local_files_only=False,  # Look for local files
    use_auth_token=False,  # Use authentication token if needed
    revision='main',  # Specific model version
    mirror=None,  # Mirror repository URL
    #gradient_checkpointing=True,
    ignore_mismatched_sizes=True
)

from peft import LoraConfig, get_peft_model

# Define LoRA configuration
lora_config = LoraConfig(
    r=8,  # The rank of the LoRA matrix
    lora_alpha=32,  # Scaling factor
    target_modules=["query", "value"],  # Target modules for LoRA
    lora_dropout=0.1,  # Dropout rate
    bias="none"  # Bias handling
)

# Apply LoRA to the model
model_hypothesis = get_peft_model(model_hypothesis, lora_config)
/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py:469: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
config.json:   0%|          | 0.00/570 [00:00<?, ?B/s]
model.safetensors:   0%|          | 0.00/440M [00:00<?, ?B/s]
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
In [ ]:
class SentenceDataset(Dataset):
    def __init__(self, dataframe, tokenizer, max_length, is_hypothesis_index):
        self.dataframe = dataframe
        self.tokenizer = tokenizer
        self.max_length = max_length
        self.is_hypothesis_index = is_hypothesis_index

    def __len__(self):
        return len(self.dataframe)

    def __getitem__(self, idx):
        sentence = self.dataframe.iloc[idx, self.is_hypothesis_index]
        label = self.dataframe.iloc[idx, 2]
        inputs = self.tokenizer.encode_plus(
            sentence,
            add_special_tokens=True,
            max_length=self.max_length,
            truncation=True,
            padding="max_length",
            return_tensors="pt",
            return_attention_mask=True,
            return_token_type_ids=True,
            return_overflowing_tokens=False,
            return_length=True,
            pad_to_max_length=True
        )
        inputs["labels"] = torch.tensor(label, dtype=torch.long)
        return {key: val.squeeze(0) for key, val in inputs.items()}
In [ ]:
train_dataset_premise = SentenceDataset(train_df, tokenizer, max_length, 0)
train_dataset_hypothesis = SentenceDataset(train_df, tokenizer, max_length, 1)
eval_dataset_premise = SentenceDataset(eval_df, tokenizer, max_length, 0)
eval_dataset_hypothesis = SentenceDataset(eval_df, tokenizer, max_length, 1)
test_dataset_premise = SentenceDataset(test_df, tokenizer, max_length, 0)
test_dataset_hypothesis = SentenceDataset(test_df, tokenizer, max_length, 1)
In [ ]:
train_dataset_premise.__getitem__(0)
Out[ ]:
{'input_ids': tensor([  101,  2748,  2023,  2003,  1037, 10225, 25318,  3185, 10303,  2045,
          2003,  2070, 11749,  5171, 10225, 25318, 13764,  8649,  1998,  1037,
          8552,  1998, 13749, 18595,  3560,  2466,  1012,  2021,  2242,  2003,
          4394,  2182,  1010,  2926,  4102,  2000, 10225, 25318,  1005,  1055,
          2010,  2060,  2147,  1012,  2364,  3291,  2003,  1045,  2228,  1996,
          3494,  1012,  2027,  2024,  2204,  2035,  1998, 18080,  1999,  1037,
          2126,  2021,  2027,  2024,  2025,  2428, 13432,  1012,  2926,  1996,
          2364,  2839,  9901,  2829, 14089, 24665,  3771,  2003, 15640,  1012,
          2016,  2003,  2025,  2428,  2844,  2438,  2004,  1037,  2839,  2000,
          4287,  2023,  2972,  3185,  1012,  2613,  2364,  2839,  1997,  1996,
          3185,  2000,  2033,  2001,  5212,  1048,  1012,  4027,  2021,  2339,
          2821,  2339,  2106,  2017,  2018,  2000,  4929,  2008,  5976,  2559,
         18865,  3520,  1029,  2839,  2008,  1045,  5621,  4669,  2001,  2728,
          2139,  9152,  3217,  1005,  1055,  1012,  2002,  3248,  1037,  4066,
          1997, 10916,  2535,  2066,  2017,  2052,  2025,  5987,  2013,  2032,
          1999,  1037,  3185,  2066,  2023,  1012,  1996,  2717,  1997,  1996,
          2839,  2024,  2025,  2428,  2844,  2030,  5875,  2438,  2000,  4287,
          2023,  2972,  3185,  1012,  2428, 10473,  5783,  1997,  1996,  3185,
          2003,  1996,  8235,  2135,  2517,  5896,  2008,  2320,  2153,  2003,
          3561,  2007,  2070,  5171, 10225, 25318,  3787,  1998, 21438,  1012,
          2748,  1010,  1996,  4566,  2453,  2018,  2042,  1037,  2978,  5410,
          1998,  3383, 15640,  2021,  2053,  2009,  2106,  2025,  9868,  1996,
          3185,  2005,  2033,  1012,  1045,  2145,  5632,  2023,  3185,  1010,
          2009,  2074,  2347,  1005,  1056,  2004,  2204,  2004,  1045,  2018,
          5113,  2005,  1012,  2009,  2001,  2785,  1997,  2066,  1996,  3185,
          2041,  1997,  4356,  2011,  7112,  2061,  4063,  4059,  2232,  2029,
          2038,  1037,  2785,  1997,  2714,  3857, 22264, 27983,  1998,  2466,
          1010,  2009,  3310,  2200,  2485,  2012,  2335,  2021,  2009,  2196,
          3243,  3138,  1996,  9850,  1012,  2145,  1037,  6669,  3422,  3085,
          1010,  3262,  4569,  3185,  2021,  2145,  2017,  2024,  2488,  1997,
          3666, 10225, 25318,  1005,  1055,  2060,  2147,  1012, 27671,   102,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0]),
 'token_type_ids': tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0]),
 'attention_mask': tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0]),
 'length': tensor(512),
 'labels': tensor(0)}
In [ ]:
# Trainer
trainer_premise = CustomTrainer(
    model=model_premise,
    args=training_args,
    train_dataset=train_dataset_premise,
    eval_dataset=eval_dataset_premise,
    data_collator=default_data_collator,
    compute_metrics=compute_metrics,
    tokenizer=tokenizer, # Pass the tokenizer here
    callbacks=[early_stopping]
)
In [ ]:
# Train the model
trainer_premise.train()

# Save the model, tokenizer, and trainer state
model_premise.save_pretrained(prefix_path + '/results_premise')
#tokenizer.save_pretrained(prefix_path + '/results')
#trainer.save_model(prefix_path + '/results')  # Save the trained model
trainer_premise.save_state()

# Save the model and tokenizer
model_save_path = prefix_path + '/results_premise' + "/model"
tokenizer_save_path = prefix_path + '/results_premise' + "./tokenizer"
trainer_premise.save_model(model_save_path)
tokenizer.save_pretrained(tokenizer_save_path)

# Save the trainer state
trainer_premise.state.save_to_json( prefix_path + '/results_premise' + "/trainer_state.json")
[2972/6660 23:59 < 29:46, 2.06 it/s, Epoch 4.46/10]
Epoch Training Loss Validation Loss Accuracy
0 0.619400 0.565407 0.716000
1 0.508100 0.572391 0.699000
3 0.476300 0.559408 0.713000

[3071/6660 24:44 < 28:55, 2.07 it/s, Epoch 4.61/10]
Epoch Training Loss Validation Loss Accuracy
0 0.619400 0.565407 0.716000
1 0.508100 0.572391 0.699000
3 0.476300 0.559408 0.713000

In [ ]:
trainer_premise.save_state()

# Save the model and tokenizer
model_save_path = prefix_path + '/results_premise' + "/model"
tokenizer_save_path = prefix_path + '/results_premise' + "./tokenizer"
trainer_premise.save_model(model_save_path)
tokenizer.save_pretrained(tokenizer_save_path)

# Save the trainer state
trainer_premise.state.save_to_json( prefix_path + '/results_premise' + "/trainer_state.json")
In [ ]:
# Trainer
trainer_hypothesis = CustomTrainer(
    model=model_premise,
    args=training_args,
    train_dataset=train_dataset_hypothesis,
    eval_dataset=eval_dataset_hypothesis,
    data_collator=default_data_collator,
    compute_metrics=compute_metrics,
    tokenizer=tokenizer, # Pass the tokenizer here
    callbacks=[early_stopping]
)
In [ ]:
# Train the model
trainer_hypothesis.train()

# Save the model, tokenizer, and trainer state
model_hypothesis.save_pretrained(prefix_path + '/results_premise')
#tokenizer.save_pretrained(prefix_path + '/results')
#trainer.save_model(prefix_path + '/results')  # Save the trained model
trainer_hypothesis.save_state()

# Save the model and tokenizer
model_save_path = prefix_path + '/results_hypothesis' + "/model"
tokenizer_save_path = prefix_path + '/results_hypothesis' + "./tokenizer"
trainer_hypothesis.save_model(model_save_path)
tokenizer.save_pretrained(tokenizer_save_path)

# Save the trainer state
trainer_hypothesis.state.save_to_json( prefix_path + '/results_hypothesis' + "/trainer_state.json")
[4000/6660 38:06 < 25:21, 1.75 it/s, Epoch 6/10]
Epoch Training Loss Validation Loss Accuracy
0 0.703500 0.686365 0.556000
1 0.685100 0.684139 0.557000
3 0.705600 0.681133 0.560000
4 0.698000 0.682720 0.544000
6 0.662500 0.681782 0.556000

Evaluation of the 2-tower ensemble on the test dataset, confusion matrix and clasification report

In [ ]:
avg_pred
Out[ ]:
tensor([0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1,
        0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1,
        0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
        1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0,
        0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0,
        0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0,
        0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0,
        1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0,
        0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0,
        0, 1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0,
        1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0,
        0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0,
        0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1,
        1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1,
        0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
        1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0,
        1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0,
        1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0,
        0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0,
        0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0,
        0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1,
        0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0,
        1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0,
        0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1,
        0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0,
        0, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0,
        0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0,
        1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0,
        1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0,
        0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1,
        0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0,
        1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1,
        0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0,
        0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0,
        1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0,
        0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0,
        0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0,
        1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1,
        0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0])
In [ ]:
import numpy as np
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
import matplotlib.pyplot as plt

predictions_premise_test = trainer_premise.predict(test_dataset_premise)
predictions_hypothesis_test = trainer_hypothesis.predict(test_dataset_hypothesis)

# Extract predicted labels
preds_premise = np.argmax(predictions_premise_test.predictions, axis=1)
preds_hypothesis = np.argmax(predictions_hypothesis_test.predictions, axis=1)

# Convert NumPy arrays to PyTorch tensors
preds_premise_tensor = torch.tensor(preds_premise, dtype=torch.float32)
preds_hypothesis_tensor = torch.tensor(preds_hypothesis, dtype=torch.float32)

# Concatenate the predictions along the second dimension
cum_preds = torch.stack([preds_premise_tensor, preds_hypothesis_tensor], dim=1)

preds = cum_preds.argmax(dim = 1)

# Compute confusion matrix
conf_matrix = confusion_matrix(test_df['label'], preds)

# Calculate confidence scores (adjusting for the 1D tensor)
confidence_scores = preds.cpu().numpy()  # Directly extract values from the 1D tensor
confidence_scores = np.where(preds == 0, 1 - confidence_scores, confidence_scores)
print(confidence_scores)

# Plot confusion matrix
disp = ConfusionMatrixDisplay(confusion_matrix=conf_matrix, display_labels=[0, 1])
disp.plot(cmap=plt.cm.Blues)
plt.title('Confusion Matrix')
plt.show()
[1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1]
No description has been provided for this image
In [ ]:
from sklearn.metrics import classification_report

classification_report = classification_report(test_df['label'], preds, target_names=['Negative', 'Positive'])
print(classification_report)
              precision    recall  f1-score   support

    Negative       0.45      0.61      0.52       500
    Positive       0.40      0.26      0.31       500

    accuracy                           0.43      1000
   macro avg       0.42      0.43      0.42      1000
weighted avg       0.42      0.43      0.42      1000

Calibration report over eval_dataset

In [ ]:
# Platt Scaling
from sklearn.linear_model import LogisticRegression
from sklearn.calibration import CalibratedClassifierCV
import numpy as np

pred_eval_premise = trainer_premise.predict(eval_dataset_premise)

preds_premise = np.argmax(pred_eval_premise.predictions, axis=1)

pred_eval_hypothesis = trainer_hypothesis.predict(eval_dataset_hypothesis)

preds_hypothesis = np.argmax(pred_eval_hypothesis.predictions, axis=1)

preds_premise_tensor = torch.tensor(preds_premise)
preds_hypothesis_tensor = torch.tensor(preds_hypothesis)

# Concatenate the predictions along the second dimension
cum_preds = torch.stack([preds_premise_tensor, preds_hypothesis_tensor], dim=1)

preds = cum_preds.argmax(dim = 1)

# Flatten the probabilities (assuming binary classification)
probs_flat = np.reshape(preds, (-1,))

# Fit a logistic regression model for calibration
calibrator = CalibratedClassifierCV(LogisticRegression(), method='sigmoid')
calibrator.fit(probs_flat.reshape(-1, 1), eval_df['label'])

# Calibrate probabilities
calibrated_probs = calibrator.predict_proba(probs_flat.reshape(-1, 1))[:, 1]

probs = np.hstack((1 - calibrated_probs.reshape(-1, 1), calibrated_probs.reshape(-1, 1)))
In [ ]:
import matplotlib.pyplot as plt
from sklearn.calibration import calibration_curve

# Plot calibration curve
prob_true, prob_pred = calibration_curve(eval_df['label'], probs[:, 1], n_bins=10)

plt.figure(figsize=(8, 8))
plt.plot(prob_pred, prob_true, marker='o', linestyle='--', color='b', label='Transformer based model')
plt.plot([0, 1], [0, 1], linestyle='--', color='k', label='Perfectly calibrated')
plt.xlabel('Predicted probability')
plt.ylabel('True probability')
plt.title('Calibration plot (Reliability Curve)')
plt.legend()
plt.show()
No description has been provided for this image