Name : Ranjana Rajendran
Email : ranjana.rajendran@gmail.com
Topic : Spoiler Alert
Summary
The problem statement: given the movie details and user reviews of a movie, flag whether a review's review_text is a spoiler for that movie.
In this study, I first examined the input data for any correlation between the other features and the is_spoiler field; none could be found. The data was also exploded on the genre field to check for a correlation between genre and is_spoiler, and again none was found. The only signal found was that the longer the review_text, the more likely the review is to be a spoiler.
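The length observation can be checked with a simple bucketed spoiler rate. The sketch below is illustrative only: the helper name `spoiler_rate_by_length`, the bucket size of 500 characters, and the toy records are assumptions, not the analysis code used in this study.

```python
from collections import defaultdict


def spoiler_rate_by_length(reviews, bucket_size=500):
    """Group reviews by character length and return the spoiler fraction per bucket.

    Keys of the returned dict are the lower bound of each length bucket.
    """
    totals = defaultdict(int)
    spoilers = defaultdict(int)
    for review in reviews:
        bucket = (len(review["review_text"]) // bucket_size) * bucket_size
        totals[bucket] += 1
        spoilers[bucket] += int(review["is_spoiler"])
    return {bucket: spoilers[bucket] / totals[bucket] for bucket in sorted(totals)}


# Toy records (hypothetical, not taken from the dataset).
toy_reviews = [
    {"review_text": "a" * 100, "is_spoiler": False},
    {"review_text": "a" * 300, "is_spoiler": False},
    {"review_text": "a" * 700, "is_spoiler": True},
    {"review_text": "a" * 900, "is_spoiler": False},
    {"review_text": "a" * 1200, "is_spoiler": True},
]
rates = spoiler_rate_by_length(toy_reviews)
print(rates)  # {0: 0.0, 500: 0.5, 1000: 1.0}
```

On the real data, a rate that rises with the bucket's lower bound would be consistent with the observation above.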
Given that no other feature was found to contribute to whether a review is a spoiler, what remains is natural language analysis of review_text together with fields such as plot_synopsis or plot_summary, using deep learning techniques.
The review_text, plot_summary, plot_synopsis and review_summary fields were first pre-processed: they were converted to lowercase (because bert-base-uncased is an uncased model), unnecessary whitespace was removed, Unicode characters were normalized, URLs, HTML tags and special characters were removed, and contractions were expanded. Experiments were also conducted with lemmatization. The data was persisted without lemmatization; however, lemmatization was applied for the Bidirectional LSTM model.
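A minimal sketch of such a cleaning step, using only the standard library, is shown here. The function name `clean_text`, the NFKD normalization form, the ASCII-only filter, and the tiny `CONTRACTIONS` map are all assumptions made for illustration; the preprocessing actually used in this study may differ in order and detail.

```python
import html
import re
import unicodedata

# Tiny illustrative contraction map (assumption); a real run needs a fuller list.
CONTRACTIONS = {
    "can't": "cannot",
    "won't": "will not",
    "it's": "it is",
    "don't": "do not",
}


def clean_text(text):
    """Lowercase, normalize Unicode, strip HTML tags, URLs and special characters."""
    text = html.unescape(text)                    # decode entities such as "&amp;"
    text = text.replace("\u2019", "'")            # curly apostrophe -> straight
    text = unicodedata.normalize("NFKD", text)    # split accents from base letters
    text = text.encode("ascii", "ignore").decode("ascii")  # drop non-ASCII marks
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)          # HTML tags
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # URLs
    for short, full in CONTRACTIONS.items():      # naive substring replacement
        text = text.replace(short, full)
    text = re.sub(r"[^a-z0-9\s]", " ", text)      # remaining special characters
    return re.sub(r"\s+", " ", text).strip()      # collapse whitespace


example = clean_text("It's <b>GREAT</b>!  See https://example.com/x  I can't wait.")
print(example)  # it is great see i cannot wait
```

Tags are stripped before URLs so that a URL wrapped in markup does not swallow the closing tag. The substring-based contraction step is deliberately naive and would also rewrite possessives ending in "it's".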
The following deep learning techniques were applied:
- Bidirectional LSTM : An accuracy of 0.7656 was achieved on validation dataset and 0.65 on test dataset. This was done on a set of 10000 positive and 10000 negative examples sampled from the full corpus.
- Pre-trained BERT on a sequence of review_text and plot_summary : Accuracy of 0.698 was achieved on validation dataset and 0.681 on test dataset. This was done on a set of 5000 positive and 5000 negative examples sampled from the full corpus.
- Pre-Trained Longformer on a sequence of review_text and plot_summary: Accuracy of 0.721 was achieved on validation dataset and 0.715 on test dataset. This was done on a set of 5000 positive and 5000 negative examples sampled from the full corpus.
- Ensemble for prediction (2 independently trained towers) of 2 separately trained BERT models, one on review_text and one on plot_synopsis: Accuracy of 0.43 was achieved on test dataset. This was done on a set of 5000 positive and 5000 negative examples sampled from the full corpus.
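Each experiment above was run on an equal number of positive and negative examples. A balanced subset of this kind can be drawn as sketched below; the helper name `balanced_sample`, the fixed seed, and the toy records are assumptions for illustration, not the sampling code used in the notebook.

```python
import random


def balanced_sample(records, n_per_class, label_key="is_spoiler", seed=42):
    """Draw n_per_class positive and n_per_class negative records without replacement."""
    rng = random.Random(seed)  # local generator keeps the draw reproducible
    positives = [r for r in records if r[label_key]]
    negatives = [r for r in records if not r[label_key]]
    if len(positives) < n_per_class or len(negatives) < n_per_class:
        raise ValueError("not enough examples in one of the classes")
    sample = rng.sample(positives, n_per_class) + rng.sample(negatives, n_per_class)
    rng.shuffle(sample)  # avoid all positives preceding all negatives
    return sample


# Toy corpus (hypothetical): every fourth record is labelled as a spoiler.
toy_corpus = [{"review_id": i, "is_spoiler": i % 4 == 0} for i in range(100)]
subset = balanced_sample(toy_corpus, n_per_class=10)
print(len(subset), sum(r["is_spoiler"] for r in subset))  # 20 10
```

With a balanced subset, chance-level accuracy is 0.5, which is a useful baseline when reading the accuracies listed above.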
Notes on training :
- For the transformer models, LoRA (https://arxiv.org/abs/2106.09685) was used for parameter-efficient fine-tuning.
- Early stopping was employed, utilizing the validation loss as the metric for early stopping.
- Given the limited time and compute available, training could be performed only on a small subset of the available dataset.
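Two of the notes above can be illustrated with a small, framework-free sketch. The first part counts the trainable parameters of a LoRA update for one 768 x 768 weight matrix (768 is the hidden size of BERT base); the rank of 8 is an assumed value, as the rank used in this study is not stated here. The second part is a minimal early-stopping rule on validation loss; the class name `EarlyStopping`, the patience of 2, and the loss values are hypothetical.

```python
def lora_trainable_params(d_out, d_in, rank):
    """Parameters in a low-rank update B @ A, with B of shape (d_out, rank)
    and A of shape (rank, d_in). The frozen weight W itself is not counted."""
    return rank * (d_out + d_in)


full_params = 768 * 768                                  # one dense projection
lora_params = lora_trainable_params(768, 768, rank=8)    # rank 8 is an assumption
print(full_params, lora_params)  # 589824 12288


class EarlyStopping:
    """Stop when validation loss has not improved for `patience` epochs in a row."""

    def __init__(self, patience=2, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Hypothetical validation losses, one per epoch.
val_losses = [0.70, 0.65, 0.66, 0.67, 0.60]
stopper = EarlyStopping(patience=2)
stopped_at = None
for epoch, loss in enumerate(val_losses):
    if stopper.step(loss):
        stopped_at = epoch
        break
print(stopped_at, stopper.best_loss)  # 3 0.65
```

In this toy run the last epoch (loss 0.60) is never reached, which shows the trade-off of a small patience value: training is cut short after two non-improving epochs even though a later epoch would have improved.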
Future work
- Train each of these models on a larger dataset.
- As shown above, Longformer gave the best results on review_text and plot_summary. Longformer could be trained on a larger dataset, with plot_synopsis in place of plot_summary, to see whether this improves on the test accuracy of 0.715.
Install required libraries
!pip install accelerate -U
exit()
Requirement already satisfied: accelerate in /usr/local/lib/python3.10/dist-packages (0.32.1)
Requirement already satisfied: numpy<2.0.0,>=1.17 in /usr/local/lib/python3.10/dist-packages (from accelerate) (1.25.2)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from accelerate) (24.1)
Requirement already satisfied: psutil in /usr/local/lib/python3.10/dist-packages (from accelerate) (5.9.5)
Requirement already satisfied: pyyaml in /usr/local/lib/python3.10/dist-packages (from accelerate) (6.0.1)
Requirement already satisfied: torch>=1.10.0 in /usr/local/lib/python3.10/dist-packages (from accelerate) (2.3.1+cu121)
Requirement already satisfied: huggingface-hub in /usr/local/lib/python3.10/dist-packages (from accelerate) (0.23.5)
Requirement already satisfied: safetensors>=0.3.1 in /usr/local/lib/python3.10/dist-packages (from accelerate) (0.4.3)
Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from torch>=1.10.0->accelerate) (3.15.4)
Requirement already satisfied: typing-extensions>=4.8.0 in /usr/local/lib/python3.10/dist-packages (from torch>=1.10.0->accelerate) (4.12.2)
Requirement already satisfied: sympy in /usr/local/lib/python3.10/dist-packages (from torch>=1.10.0->accelerate) (1.13.0)
Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from torch>=1.10.0->accelerate) (3.3)
Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch>=1.10.0->accelerate) (3.1.4)
Requirement already satisfied: fsspec in /usr/local/lib/python3.10/dist-packages (from torch>=1.10.0->accelerate) (2023.6.0)
Collecting nvidia-cuda-nvrtc-cu12==12.1.105 (from torch>=1.10.0->accelerate)
Using cached nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (23.7 MB)
Collecting nvidia-cuda-runtime-cu12==12.1.105 (from torch>=1.10.0->accelerate)
Using cached nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (823 kB)
Collecting nvidia-cuda-cupti-cu12==12.1.105 (from torch>=1.10.0->accelerate)
Using cached nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (14.1 MB)
Collecting nvidia-cudnn-cu12==8.9.2.26 (from torch>=1.10.0->accelerate)
Using cached nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64.whl (731.7 MB)
Collecting nvidia-cublas-cu12==12.1.3.1 (from torch>=1.10.0->accelerate)
Using cached nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl (410.6 MB)
Collecting nvidia-cufft-cu12==11.0.2.54 (from torch>=1.10.0->accelerate)
Using cached nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl (121.6 MB)
Collecting nvidia-curand-cu12==10.3.2.106 (from torch>=1.10.0->accelerate)
Using cached nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl (56.5 MB)
Collecting nvidia-cusolver-cu12==11.4.5.107 (from torch>=1.10.0->accelerate)
Using cached nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl (124.2 MB)
Collecting nvidia-cusparse-cu12==12.1.0.106 (from torch>=1.10.0->accelerate)
Using cached nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl (196.0 MB)
Collecting nvidia-nccl-cu12==2.20.5 (from torch>=1.10.0->accelerate)
Using cached nvidia_nccl_cu12-2.20.5-py3-none-manylinux2014_x86_64.whl (176.2 MB)
Collecting nvidia-nvtx-cu12==12.1.105 (from torch>=1.10.0->accelerate)
Using cached nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (99 kB)
Requirement already satisfied: triton==2.3.1 in /usr/local/lib/python3.10/dist-packages (from torch>=1.10.0->accelerate) (2.3.1)
Collecting nvidia-nvjitlink-cu12 (from nvidia-cusolver-cu12==11.4.5.107->torch>=1.10.0->accelerate)
Downloading nvidia_nvjitlink_cu12-12.5.82-py3-none-manylinux2014_x86_64.whl (21.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 21.3/21.3 MB 66.8 MB/s eta 0:00:00
Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from huggingface-hub->accelerate) (2.31.0)
Requirement already satisfied: tqdm>=4.42.1 in /usr/local/lib/python3.10/dist-packages (from huggingface-hub->accelerate) (4.66.4)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->torch>=1.10.0->accelerate) (2.1.5)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests->huggingface-hub->accelerate) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests->huggingface-hub->accelerate) (3.7)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests->huggingface-hub->accelerate) (2.0.7)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests->huggingface-hub->accelerate) (2024.7.4)
Requirement already satisfied: mpmath<1.4,>=1.1.0 in /usr/local/lib/python3.10/dist-packages (from sympy->torch>=1.10.0->accelerate) (1.3.0)
Installing collected packages: nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufft-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, nvidia-cusparse-cu12, nvidia-cudnn-cu12, nvidia-cusolver-cu12
Successfully installed nvidia-cublas-cu12-12.1.3.1 nvidia-cuda-cupti-cu12-12.1.105 nvidia-cuda-nvrtc-cu12-12.1.105 nvidia-cuda-runtime-cu12-12.1.105 nvidia-cudnn-cu12-8.9.2.26 nvidia-cufft-cu12-11.0.2.54 nvidia-curand-cu12-10.3.2.106 nvidia-cusolver-cu12-11.4.5.107 nvidia-cusparse-cu12-12.1.0.106 nvidia-nccl-cu12-2.20.5 nvidia-nvjitlink-cu12-12.5.82 nvidia-nvtx-cu12-12.1.105
! pip install ray
Requirement already satisfied: ray in /usr/local/lib/python3.10/dist-packages (2.32.0)
Requirement already satisfied: click>=7.0 in /usr/local/lib/python3.10/dist-packages (from ray) (8.1.7)
Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from ray) (3.15.4)
Requirement already satisfied: jsonschema in /usr/local/lib/python3.10/dist-packages (from ray) (4.19.2)
Requirement already satisfied: msgpack<2.0.0,>=1.0.0 in /usr/local/lib/python3.10/dist-packages (from ray) (1.0.8)
Requirement already satisfied: packaging in /usr/local/lib/python3.10/dist-packages (from ray) (24.1)
Requirement already satisfied: protobuf!=3.19.5,>=3.15.3 in /usr/local/lib/python3.10/dist-packages (from ray) (3.20.3)
Requirement already satisfied: pyyaml in /usr/local/lib/python3.10/dist-packages (from ray) (6.0.1)
Requirement already satisfied: aiosignal in /usr/local/lib/python3.10/dist-packages (from ray) (1.3.1)
Requirement already satisfied: frozenlist in /usr/local/lib/python3.10/dist-packages (from ray) (1.4.1)
Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from ray) (2.31.0)
Requirement already satisfied: attrs>=22.2.0 in /usr/local/lib/python3.10/dist-packages (from jsonschema->ray) (23.2.0)
Requirement already satisfied: jsonschema-specifications>=2023.03.6 in /usr/local/lib/python3.10/dist-packages (from jsonschema->ray) (2023.12.1)
Requirement already satisfied: referencing>=0.28.4 in /usr/local/lib/python3.10/dist-packages (from jsonschema->ray) (0.35.1)
Requirement already satisfied: rpds-py>=0.7.1 in /usr/local/lib/python3.10/dist-packages (from jsonschema->ray) (0.19.0)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests->ray) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests->ray) (3.7)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests->ray) (2.0.7)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests->ray) (2024.7.4)
! pip install datasets
Collecting datasets
Downloading datasets-2.20.0-py3-none-any.whl (547 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 547.8/547.8 kB 9.0 MB/s eta 0:00:00
Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from datasets) (3.15.4)
Requirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.10/dist-packages (from datasets) (1.25.2)
Collecting pyarrow>=15.0.0 (from datasets)
Downloading pyarrow-17.0.0-cp310-cp310-manylinux_2_28_x86_64.whl (39.9 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 39.9/39.9 MB 43.0 MB/s eta 0:00:00
Requirement already satisfied: pyarrow-hotfix in /usr/local/lib/python3.10/dist-packages (from datasets) (0.6)
Collecting dill<0.3.9,>=0.3.0 (from datasets)
Downloading dill-0.3.8-py3-none-any.whl (116 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 116.3/116.3 kB 17.4 MB/s eta 0:00:00
Requirement already satisfied: pandas in /usr/local/lib/python3.10/dist-packages (from datasets) (2.0.3)
Collecting requests>=2.32.2 (from datasets)
Downloading requests-2.32.3-py3-none-any.whl (64 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 64.9/64.9 kB 10.6 MB/s eta 0:00:00
Requirement already satisfied: tqdm>=4.66.3 in /usr/local/lib/python3.10/dist-packages (from datasets) (4.66.4)
Collecting xxhash (from datasets)
Downloading xxhash-3.4.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (194 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 194.1/194.1 kB 26.4 MB/s eta 0:00:00
Collecting multiprocess (from datasets)
Downloading multiprocess-0.70.16-py310-none-any.whl (134 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 134.8/134.8 kB 19.1 MB/s eta 0:00:00
Requirement already satisfied: fsspec[http]<=2024.5.0,>=2023.1.0 in /usr/local/lib/python3.10/dist-packages (from datasets) (2023.6.0)
Requirement already satisfied: aiohttp in /usr/local/lib/python3.10/dist-packages (from datasets) (3.9.5)
Requirement already satisfied: huggingface-hub>=0.21.2 in /usr/local/lib/python3.10/dist-packages (from datasets) (0.23.5)
Requirement already satisfied: packaging in /usr/local/lib/python3.10/dist-packages (from datasets) (24.1)
Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.10/dist-packages (from datasets) (6.0.1)
Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets) (1.3.1)
Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets) (23.2.0)
Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets) (1.4.1)
Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets) (6.0.5)
Requirement already satisfied: yarl<2.0,>=1.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets) (1.9.4)
Requirement already satisfied: async-timeout<5.0,>=4.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets) (4.0.3)
Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.10/dist-packages (from huggingface-hub>=0.21.2->datasets) (4.12.2)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests>=2.32.2->datasets) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests>=2.32.2->datasets) (3.7)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests>=2.32.2->datasets) (2.0.7)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests>=2.32.2->datasets) (2024.7.4)
Requirement already satisfied: python-dateutil>=2.8.2 in /usr/local/lib/python3.10/dist-packages (from pandas->datasets) (2.8.2)
Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas->datasets) (2023.4)
Requirement already satisfied: tzdata>=2022.1 in /usr/local/lib/python3.10/dist-packages (from pandas->datasets) (2024.1)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.8.2->pandas->datasets) (1.16.0)
Installing collected packages: xxhash, requests, pyarrow, dill, multiprocess, datasets
Attempting uninstall: requests
Found existing installation: requests 2.31.0
Uninstalling requests-2.31.0:
Successfully uninstalled requests-2.31.0
Attempting uninstall: pyarrow
Found existing installation: pyarrow 14.0.2
Uninstalling pyarrow-14.0.2:
Successfully uninstalled pyarrow-14.0.2
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
cudf-cu12 24.4.1 requires pyarrow<15.0.0a0,>=14.0.1, but you have pyarrow 17.0.0 which is incompatible.
google-colab 1.0.0 requires requests==2.31.0, but you have requests 2.32.3 which is incompatible.
ibis-framework 8.0.0 requires pyarrow<16,>=2, but you have pyarrow 17.0.0 which is incompatible.
Successfully installed datasets-2.20.0 dill-0.3.8 multiprocess-0.70.16 pyarrow-17.0.0 requests-2.32.3 xxhash-3.4.1
prefix_path = '/content/drive/MyDrive/ML-SwitchUP/Deep Learning Projects/NLP Project/'
Download and prepare the data
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive
import os
import gdown
import zipfile

zip_url = "https://drive.google.com/uc?export=download&id=16zn74iUWV-Qiv4yYkaMY-spr-LJIys9Z"
dataset_path = os.path.join(prefix_path, "dataset", "imdb_spoiler")
zip_file_path = os.path.join(dataset_path, "imdb_spoiler_dataset.zip")

# Create the dataset directory on first run
if not os.path.exists(dataset_path):
    os.makedirs(dataset_path)

# Download the archive only if it is not already present
if not os.path.exists(zip_file_path):
    gdown.download(zip_url, zip_file_path, quiet=False)

# Extract the archive into the dataset directory
with zipfile.ZipFile(zip_file_path, "r") as zip_ref:
    zip_ref.extractall(dataset_path)
This creates two JSON files, IMDB_reviews.json and IMDB_movie_details.json.
# Read reviews and movie details from the JSON files (one JSON object per line)
import json
import pandas as pd
import os

dataset_path = os.path.join(prefix_path, "dataset", "imdb_spoiler")
reviews_json_path = os.path.join(dataset_path, "IMDB_reviews.json")
movie_details_json_path = os.path.join(dataset_path, "IMDB_movie_details.json")

reviews = []
with open(reviews_json_path, 'r') as f:
    for line in f:
        reviews.append(json.loads(line))
reviews_df = pd.DataFrame(reviews)

movie_details = []
with open(movie_details_json_path, 'r') as f:
    for line in f:
        movie_details.append(json.loads(line))
movie_details_df = pd.DataFrame(movie_details)
Data pre-processing
reviews_df
| | review_date | movie_id | user_id | is_spoiler | review_text | rating | review_summary |
|---|---|---|---|---|---|---|---|
| 0 | 10 February 2006 | tt0111161 | ur1898687 | True | In its Oscar year, Shawshank Redemption (writt... | 10 | A classic piece of unforgettable film-making. |
| 1 | 6 September 2000 | tt0111161 | ur0842118 | True | The Shawshank Redemption is without a doubt on... | 10 | Simply amazing. The best film of the 90's. |
| 2 | 3 August 2001 | tt0111161 | ur1285640 | True | I believe that this film is the best story eve... | 8 | The best story ever told on film |
| 3 | 1 September 2002 | tt0111161 | ur1003471 | True | **Yes, there are SPOILERS here**This film has ... | 10 | Busy dying or busy living? |
| 4 | 20 May 2004 | tt0111161 | ur0226855 | True | At the heart of this extraordinary movie is a ... | 8 | Great story, wondrously told and acted |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 573908 | 8 August 1999 | tt0139239 | ur0100166 | False | Go is wise, fast and pure entertainment. Assem... | 10 | The best teen movie of the nineties |
| 573909 | 31 July 1999 | tt0139239 | ur0021767 | False | Well, what shall I say. this one´s fun at any ... | 9 | Go - see the movie |
| 573910 | 20 July 1999 | tt0139239 | ur0392750 | False | Go is the best movie I have ever seen, and I'v... | 10 | It's the best movie I've ever seen |
| 573911 | 11 June 1999 | tt0139239 | ur0349105 | False | Call this 1999 teenage version of Pulp Fiction... | 3 | Haven't we seen this before? |
| 573912 | 3 May 1999 | tt0139239 | ur0156431 | False | Why was this movie made? No doubt to sucker in... | 2 | Go doesn't go anywhere |
573913 rows × 7 columns
Add a review_id feature to distinguish between reviews after a join.
from uuid import uuid4
reviews_df['review_id'] = reviews_df.index.to_series().map(lambda x: uuid4())
movie_details_df
| | movie_id | plot_summary | duration | genre | rating | release_date | plot_synopsis |
|---|---|---|---|---|---|---|---|
| 0 | tt0105112 | Former CIA analyst, Jack Ryan is in England wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | Jack Ryan (Ford) is on a "working vacation" in... |
| 1 | tt1204975 | Billy (Michael Douglas), Paddy (Robert De Niro... | 1h 45min | [Comedy] | 6.6 | 2013-11-01 | Four boys around the age of 10 are friends in ... |
| 2 | tt0243655 | The setting is Camp Firewood, the year 1981. I... | 1h 37min | [Comedy, Romance] | 6.7 | 2002-04-11 | |
| 3 | tt0040897 | Fred C. Dobbs and Bob Curtin, both down on the... | 2h 6min | [Adventure, Drama, Western] | 8.3 | 1948-01-24 | Fred Dobbs (Humphrey Bogart) and Bob Curtin (T... |
| 4 | tt0126886 | Tracy Flick is running unopposed for this year... | 1h 43min | [Comedy, Drama, Romance] | 7.3 | 1999-05-07 | Jim McAllister (Matthew Broderick) is a much-a... |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 1567 | tt0289879 | Evan Treborn grows up in a small town with his... | 1h 53min | [Sci-Fi, Thriller] | 7.7 | 2004-01-23 | In the year 1998, Evan Treborn (Ashton Kutcher... |
| 1568 | tt1723811 | Brandon is a 30-something man living in New Yo... | 1h 41min | [Drama] | 7.2 | 2012-01-13 | Brandon (Michael Fassbender) is a successful, ... |
| 1569 | tt5013056 | Evacuation of Allied soldiers from the British... | 1h 46min | [Action, Drama, History] | 8.1 | 2017-07-21 | The film alternates between three different pe... |
| 1570 | tt0104014/ | For a while now, beautiful 24-year-old Diana B... | 1h 33min | [Comedy, Drama] | 5.3 | 1992-02-21 | |
| 1571 | tt0114142/ | The marriage of David Burgess, a senior execut... | 1h 32min | [Drama, Thriller] | 4.0 | 1999-01-29 | |
1572 rows × 7 columns
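Note that some movie_id values in the table above carry a trailing slash (for example tt0104014/). If reviews are later joined to movie details on movie_id, such rows could fail to match. The sketch below shows one way to normalize the key; the helper name `normalize_movie_id` is hypothetical, and whether the reviews file uses the slash-free form for these movies has not been verified here.

```python
def normalize_movie_id(movie_id):
    """Strip surrounding whitespace and any trailing slash from an IMDb-style id."""
    return movie_id.strip().rstrip("/")


# Sample ids copied from the table above.
movie_ids = ["tt0105112", "tt0104014/", "tt0114142/"]
normalized = [normalize_movie_id(m) for m in movie_ids]
print(normalized)  # ['tt0105112', 'tt0104014', 'tt0114142']
```

In pandas, the same normalization can be applied to the whole column with `movie_details_df['movie_id'].str.rstrip('/')` before merging.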
# Find movies where plot synopsis length is zero
movie_details_df[movie_details_df['plot_synopsis'].str.len() == 0]
| | movie_id | plot_summary | duration | genre | rating | release_date | plot_synopsis |
|---|---|---|---|---|---|---|---|
| 2 | tt0243655 | The setting is Camp Firewood, the year 1981. I... | 1h 37min | [Comedy, Romance] | 6.7 | 2002-04-11 | |
| 10 | tt0107131 | Three pets (Chance, a young dog unfamiliar wit... | 1h 24min | [Adventure, Comedy, Drama] | 6.9 | 1993-02-12 | |
| 11 | tt0110364 | In Urbania, Ohio, snobby ex-football star Kevi... | 1h 47min | [Comedy, Family, Sport] | 6.3 | 1994-10-14 | |
| 22 | tt0101700 | The story is centered on a microcosm of a post... | 1h 39min | [Comedy, Crime] | 7.7 | 1992-04-03 | |
| 24 | tt0102768 | Henry is a lawyer who survives a shooting only... | 1h 48min | [Drama, Romance] | 6.7 | 1991-07-10 | |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 1556 | tt0285531 | Four childhood friends, Jonesy, Beaver, Pete a... | 2h 14min | [Drama, Horror, Sci-Fi] | 5.5 | 2003-03-21 | |
| 1564 | tt0914798 | Young Bruno lives a wealthy lifestyle in prewa... | 1h 34min | [Drama, War] | 7.8 | 2008-11-26 | |
| 1566 | tt4047038 | Centers on the titular holistic detective who ... | 1h | [Comedy, Mystery, Sci-Fi] | 8.4 | 2017-04-01 | |
| 1570 | tt0104014/ | For a while now, beautiful 24-year-old Diana B... | 1h 33min | [Comedy, Drama] | 5.3 | 1992-02-21 | |
| 1571 | tt0114142/ | The marriage of David Burgess, a senior execut... | 1h 32min | [Drama, Thriller] | 4.0 | 1999-01-29 | |
233 rows × 7 columns
movie_details_df[movie_details_df['plot_summary'].str.len() == 0]
| | movie_id | plot_summary | duration | genre | rating | release_date | plot_synopsis |
|---|---|---|---|---|---|---|---|
There are no movies with empty plot_summary
Let us examine the text of a plot summary for a movie for which plot synopsis is empty.
movie_details_df[movie_details_df['movie_id'] == 'tt0243655']['plot_summary'].item()
"The setting is Camp Firewood, the year 1981. It's the last day before everyone goes back to the real world, but there's still a summer's worth of unfinished business to resolve. At the center of the action is camp director Beth, who struggles to keep order while she falls in love with the local astrophysics professor. He is busy trying to save the camp from a deadly piece of NASA's Skylab which is hurtling toward earth. All that, plus: a dangerous waterfall rescue, love triangles, misfits, cool kids, and talking vegetable cans. The questions will all be resolved, of course, at the big talent show at the end of the day. Written by\nSujit R. Varma"
Let us examine the plot synopsis and plot summary of a movie for which the plot synopsis is not empty.
movie_details_df[movie_details_df['plot_synopsis'].str.len() != 0]
| | movie_id | plot_summary | duration | genre | rating | release_date | plot_synopsis |
|---|---|---|---|---|---|---|---|
| 0 | tt0105112 | Former CIA analyst, Jack Ryan is in England wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | Jack Ryan (Ford) is on a "working vacation" in... |
| 1 | tt1204975 | Billy (Michael Douglas), Paddy (Robert De Niro... | 1h 45min | [Comedy] | 6.6 | 2013-11-01 | Four boys around the age of 10 are friends in ... |
| 3 | tt0040897 | Fred C. Dobbs and Bob Curtin, both down on the... | 2h 6min | [Adventure, Drama, Western] | 8.3 | 1948-01-24 | Fred Dobbs (Humphrey Bogart) and Bob Curtin (T... |
| 4 | tt0126886 | Tracy Flick is running unopposed for this year... | 1h 43min | [Comedy, Drama, Romance] | 7.3 | 1999-05-07 | Jim McAllister (Matthew Broderick) is a much-a... |
| 5 | tt0286716 | Bruce Banner, a brilliant scientist with a clo... | 2h 18min | [Action, Sci-Fi] | 5.7 | 2003-06-20 | Bruce Banner (Eric Bana) is a research scienti... |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 1563 | tt0120655 | An abortion clinic worker with a special herit... | 2h 10min | [Adventure, Comedy, Drama] | 7.3 | 1999-11-12 | The film opens with a homeless man (Bud Cort) ... |
| 1565 | tt0276751 | Twelve year old Marcus Brewer lives with his c... | 1h 41min | [Comedy, Drama, Romance] | 7.1 | 2002-05-17 | Will Freeman (Hugh Grant) is a 38-year-old bac... |
| 1567 | tt0289879 | Evan Treborn grows up in a small town with his... | 1h 53min | [Sci-Fi, Thriller] | 7.7 | 2004-01-23 | In the year 1998, Evan Treborn (Ashton Kutcher... |
| 1568 | tt1723811 | Brandon is a 30-something man living in New Yo... | 1h 41min | [Drama] | 7.2 | 2012-01-13 | Brandon (Michael Fassbender) is a successful, ... |
| 1569 | tt5013056 | Evacuation of Allied soldiers from the British... | 1h 46min | [Action, Drama, History] | 8.1 | 2017-07-21 | The film alternates between three different pe... |
1339 rows × 7 columns
movie_details_df[movie_details_df['movie_id'] == 'tt0105112']['plot_summary'].item()
"Former CIA analyst, Jack Ryan is in England with his family on vacation when he suddenly witnesses an explosion outside Buckingham Palace. It is revealed that some people are trying to abduct a member of the Royal Family but Jack intervenes, killing one of them and capturing the other, and stops the plan in its tracks. Afterwards, he learns that they're Irish revolutionaries and the two men are brothers. During his court hearing the one that's still alive vows to get back at Jack but is sentenced and that seems to be the end of it. However, whilst the man is being transported, he is broken out. Jack learns of this but doesn't think there's anything to worry about. But, when he is at the Naval Academy someone tries to kill him. He learns that they are also going after his family and so he rushes to find them, safe but having also been the victims of a failed assassination. That's when Jack decides to rejoin the CIA, and they try to find the man before he makes another attempt. Written by\nrcs0411@yahoo.com"
movie_details_df[movie_details_df['movie_id'] == 'tt0105112']['plot_synopsis'].item()
'Jack Ryan (Ford) is on a "working vacation" in London with his family. He has retired from the CIA and is a Professor at the US Naval Academy. He is seen delivering a lecture at the Royal Naval Academy in London.Meanwhile, Ryan\'s wife Cathy and daughter Sally are sightseeing near Buckingham Palace. Sally and Cathy come upon a British Royal Guard, and Sally tries to get the guard to react by doing an improvised tap dance in front of him. She\'s impressed when the guard, trained to ignore distraction, doesn\'t react at all, and they leave.As Sally and Cathy walk away from the guard, en route to rendezvous with Ryan, they walk by a stolen cab, in which sit three Ulster Liberation Army terrorists: Kevin O\'Donnell, the driver, as well as Sean Miller (Sean Bean) and his younger brother Patrick. The three are loading bullets into their guns as they prepare to carry out a scheduled ambush on Lord William Holmes, British Secretary of State for Northern Ireland and a distant member of the British Royal Family (the cousin of the Queen Mother). As they finish loading their weapons, another terrorist, Annette, radios to them that Lord Holmes is leaving the palace with his wife and son.As Lord Holmes\'s car leaves the palace, the terrorists\' car falls in line behind him. They follow the car, headed for the ambush point. As they drive, O\'Donnell and the Miller brothers don ski masks.Meanwhile, Ryan is about to cross the street to meet with his wife and daughter. He arrives at the same time that Lord Holmes\'s car is passing through the area.Suddenly, another cab, parked on the side of the road, driven by Annette, pulls out and swerves in front of Lord Holmes\'s vehicle. Simultaneously, O\'Donnell\'s vehicle swerves to a stop behind the car. The Millers and another accomplice jump out and quickly plant a bomb underneath the chauffeured car\'s engine. Ryan sees what\'s going on and hastily gets his family to cover just as the bomb is detonated. 
Moments later, Sean and another accomplice begin shooting into the car, riddling Lord Holmes\'s driver and bodyguard with bullet. The onslaught of bullets does not kill Lord Holmes, who is currently using his body to shield his family. Sean marches over to the rear door and orders Lord Holmes out of the car at gunpoint. Ryan, unable to stand being just a bystander, immediately breaks cover, runs up behind Miller, and disarms him. As Ryan grabs Sean\'s pistol, he gets shot in the left shoulder by another terrorist. He shoots one of the other terrorists in the shoulder. Patrick attempts to flee, but Ryan fatally shoots him in the chest. Seeing two of the royal guards racing towards the scene, the remaining terrorists quickly get back in their cars and drive off, leaving Sean to be captured when the authorities arrive. They retreat back to a warehouse, where they change out license plates.While recovering, Ryan is called to testify against Miller as a witness. Subsequently, Miller is convicted on all charges and sentenced to life in prison. Ryan is awarded the order of Knight Commander of the Victorian Order, and eventually returns to the United States.On the possibility that Miller\'s ULA comrades might try to liberate him, the authorities take no chances. To that extent, they set up decoy convoys on the day that Miller is to be transferred to Albany Prison on the Isle of Wight. However, someone manages to tip off the ULA as to which convoy is carrying Miller. Miller\'s motorcade is in transit when it comes to a stop at a drawbridge. The raised bridge, however, is a trap meant to block them in. Moments after the convoy comes to a stop, Miller\'s comrades attack, using rocket launchers to blow up the cars at the front and rear of the convoy. They then march up to the prison van and order the guards out at gunpoint by taking the bridge operator hostage and threatening to shoot him. One of the guards hesitates and is shot fatally. 
The other officer and the inspector are pulled out, thrown on the ground and forced to lie on their stomachs. O\'Donnell hands Miller a pistol, with which he coldly executes both officers and the bridge operator by shooting them in the backs of their heads. They then depart the scene.Miller and his companions flee on a cargo ship from Britain to Libya, to prepare for their next kidnapping attempt on Lord Holmes. Miller however, cannot shake his anger towards Ryan for killing his younger brother and persuades several members of his entourage to accompany him to the United States on a short mission targeting Ryan and his family.Miller travels to the United States, accompanied by Annette, comrade Ned Clark, and a fourth henchman. Annette and Clark travel to the United States Naval Academy to ambush Ryan as he\'s leaving work. Ryan notices Clark idling nearby. Clark casually walks away, but Ryan clearly unnerved, then gets even more nervous as he hears a car engine starting. His suspicions are well warranted-as he continues walking down the street, he looks in the reflection of a parked van\'s rearview door and sees that Clark, and the stolen car driven by Annette, are following him. Ryan pretends not to see them until he\'s walking past a parked panel truck, at which point he ducks behind the truck. Clark draws a silenced pistol and prepares to corner Ryan, but Ryan attacks him first. Clark quickly gets the upper hand and throws Ryan to the curb. He grabs his pistol and is about to shoot Ryan when he\'s shot dead himself by a US Marine sentry. Annette quickly drives away in her Jeep. Ryan, realizing that Miller is going to target Cathy and Sally, jumps into his car and drives off.Simultaneously, Miller and the other henchman follow Cathy as she picks up Sally from school, with Miller riding shotgun. They prepare to follow the two, but are temporarily held up by a crossing guard. Once the crossing guard clears, they pursue her to US Highway 50. 
Ryan tries to frantically call Cathy on his car phone, but she is busy answering phone calls from the hospital. As he\'s trying to connect with her, Miller and his henchman recklessly weave in and out of traffic to catch up with Cathy. Ryan eventually gets through to Cathy and tells her to get off the highway and find the nearest police station. Unfortunately, just as the call ends, Miller\'s van comes charging up on her using the left shoulder. Miller moves to the backseat of the van, grabs a submachine gun, and slides open the door. Another car momentarily delays him, but once he has a clear aim, Miller opens fire, spraying Cathy\'s car with bullets. A multicar pileup ensues as Cathy loses control of her car and crashes head-on into a concrete barrier. Ryan, who has found himself stuck in traffic headed in the other direction, sees the smoke from the crash and realizes the worst.Cathy and Sally are airlifted to a nearby hospital for treatment. Enraged over the near-loss of his family, Ryan decides to go back to work for the CIA, having earlier rejected the appeal of his former superior, Vice Admiral James Greer (James Earl Jones).Ryan\'s tireless work leads him to conclude that Miller has taken refuge in a training camp, one of many located in Libya. He also determines that a recent new arrival at the camp may be one of O\'Donnell\'s English informants, Dennis Cooley. Ryan makes his recommendations to his superiors at the CIA. Greer invites Jack to a live feed of an SAS strike team attack on the camp. Everyone in the camp is eliminated but unbeknownst to the CIA and Ryan, Miller and his companions have already fled the camp and are on their way to the US to stage their next attack on Lord Holmes.Lord Holmes decides to visit Ryan at his home to formally present his KCVO. 
With the aid of Lord Holmes\' traitorous assistant, Miller\'s group tracks Holmes to this location, kills the Diplomatic Security Service agents and Maryland state troopers guarding the house, and attempts once more to kidnap Lord Holmes. Ryan leads Holmes and his family to safety while he attempts to lure Miller and his companions away from his home.The FBI Hostage Rescue Teams are scrambled to pick up Holmes. Upon realising that Ryan is leading them away from Holmes, Miller\'s companions try to persuade Miller to turn around, but an enraged and deranged Miller kills his terrorist companions and continues his pursuit of Ryan. Ryan and Miller fight hand to hand; Miller is killed when Ryan impales him backward on a boat anchor, and his body is obliterated in the subsequent explosion of the craft.Credits roll just after Caroline Ryan learns the gender of the child she is going to have, and before she tells Jack and Sally.'
The plot synopsis is very long and hard to digest at a glance. Let us look at the reviews for this movie and see which ones are marked as spoilers and which are not.
Look for null values¶
# Do all reviews have movie_id associated
reviews_df[reviews_df['movie_id'].isna()]
| review_date | movie_id | user_id | is_spoiler | review_text | rating | review_summary | review_id |
|---|---|---|---|---|---|---|---|
reviews_df.isna().sum()
review_date       0
movie_id          0
user_id           0
is_spoiler        0
review_text       0
rating            0
review_summary    0
review_id         0
dtype: int64
movie_details_df.isna().sum()
movie_id         0
plot_summary     0
duration         0
genre            0
rating           0
release_date     0
plot_synopsis    0
dtype: int64
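Note that `isna()` counts only missing values (NaN/None); an empty string is not counted as null. The following is a minimal sketch, on an invented toy frame, of counting empty or whitespace-only strings alongside the null counts. The helper `count_blank_strings` and the toy values are assumptions for illustration and are not part of the original notebook.

```python
import pandas as pd

# Toy stand-in for movie_details_df; the values are invented for illustration.
toy_details = pd.DataFrame({
    "movie_id": ["tt001", "tt002", "tt003"],
    "plot_summary": ["a summary", "another summary", "third summary"],
    "plot_synopsis": ["a long synopsis", "", "   "],
})

def count_blank_strings(df: pd.DataFrame) -> pd.Series:
    """Hypothetical helper: per column, count string cells that are empty after stripping."""
    return df.apply(
        lambda col: col.map(lambda v: isinstance(v, str) and v.strip() == "").sum()
    )

null_counts = toy_details.isna().sum()            # NaN/None only
blank_counts = count_blank_strings(toy_details)   # "" and whitespace-only strings

print(null_counts)
print(blank_counts)
```

On the toy frame, `plot_synopsis` has no nulls but two blank entries, which is the kind of gap a plain null check would not reveal.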
Join reviews and movie_details¶
## Join reviews and movie_detail
movie_reviews = pd.merge(movie_details_df, reviews_df, on = ['movie_id'], how = 'outer', indicator = True)
movie_reviews.head()
| | movie_id | plot_summary | duration | genre | rating_x | release_date | plot_synopsis | review_date | user_id | is_spoiler | review_text | rating_y | review_summary | review_id | _merge |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | tt0105112 | Former CIA analyst, Jack Ryan is in England wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | Jack Ryan (Ford) is on a "working vacation" in... | 5 March 2008 | ur16517420 | True | The second Tom Clancy novel made into a film (... | 6 | decent Clancy thriller | bbc4b9ff-ea1b-455a-8b7e-32370d35fe17 | both |
| 1 | tt0105112 | Former CIA analyst, Jack Ryan is in England wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | Jack Ryan (Ford) is on a "working vacation" in... | 15 October 2007 | ur13887584 | True | The second in what looks like becoming the 'Ja... | 7 | Jack Ryan Chapter II | e04d4f23-89b5-4715-add8-1ee37d46dd99 | both |
| 2 | tt0105112 | Former CIA analyst, Jack Ryan is in England wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | Jack Ryan (Ford) is on a "working vacation" in... | 16 September 2013 | ur2403537 | True | I was not a fan of The Hunt For Red October. I... | 8 | It's filled to the brim with excitement and it... | b28cbca4-dc30-421c-8e3a-dc2e3cba98c1 | both |
| 3 | tt0105112 | Former CIA analyst, Jack Ryan is in England wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | Jack Ryan (Ford) is on a "working vacation" in... | 15 May 2007 | ur0819382 | True | Jack Ryan (Harrison Ford) is a CIA analyst who... | 7 | Patriotic Gore. | c657363f-224d-48db-a029-8e4ddd2b248d | both |
| 4 | tt0105112 | Former CIA analyst, Jack Ryan is in England wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | Jack Ryan (Ford) is on a "working vacation" in... | 29 February 2004 | ur2567338 | True | This was one of the big summer movies of 1992.... | 7 | Harrison Ford In Another Tom Clancy Adaptation | 44c6a555-674d-4f1c-a9fa-88b35b82c8ba | both |
# movies without any review
movie_reviews[movie_reviews['_merge'] == 'left_only']
| | movie_id | plot_summary | duration | genre | rating_x | release_date | plot_synopsis | review_date | user_id | is_spoiler | review_text | rating_y | review_summary | review_id | _merge |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 573906 | tt0104014/ | For a while now, beautiful 24-year-old Diana B... | 1h 33min | [Comedy, Drama] | 5.3 | 1992-02-21 | | NaN | NaN | NaN | NaN | NaN | NaN | NaN | left_only |
| 573907 | tt0114142/ | The marriage of David Burgess, a senior execut... | 1h 32min | [Drama, Thriller] | 4.0 | 1999-01-29 | | NaN | NaN | NaN | NaN | NaN | NaN | NaN | left_only |
# reviews whose movie_id has no match in movie_details
movie_reviews[movie_reviews['_merge'] == 'right_only']
| | movie_id | plot_summary | duration | genre | rating_x | release_date | plot_synopsis | review_date | user_id | is_spoiler | review_text | rating_y | review_summary | review_id | _merge |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 573908 | tt0104014 | NaN | NaN | NaN | NaN | NaN | NaN | 29 August 2006 | ur0039450 | False | This must be one of the most erotic films ever... | 10 | Erotic with an big E | c4187589-e37f-4af4-a28d-d22d864c3667 | right_only |
| 573909 | tt0104014 | NaN | NaN | NaN | NaN | NaN | NaN | 16 July 2012 | ur5358902 | False | Tinto Brass is usually referred to as either a... | 7 | Silly and Sexy | 83ddfdd4-f4eb-4dbb-8b43-34c09531e83e | right_only |
| 573910 | tt0104014 | NaN | NaN | NaN | NaN | NaN | NaN | 25 October 2015 | ur63416246 | False | One of the best erotic movies of all times. Mo... | 8 | Excellento | 66bb64ee-b216-4809-bff0-d7ddf2bfbed8 | right_only |
| 573911 | tt0104014 | NaN | NaN | NaN | NaN | NaN | NaN | 14 February 2006 | ur2851731 | False | call me crass, but i preferred the higher titi... | 3 | horrible male lead | 08f1b8d5-2801-4b4c-9348-7ceb54988a44 | right_only |
| 573912 | tt0104014 | NaN | NaN | NaN | NaN | NaN | NaN | 11 April 2008 | ur10916790 | False | I heard Tinto Brass did eroticism, fine. I hea... | 1 | Go rent a playboy video | 1931f6ef-d218-49c3-968f-661406c64ec7 | right_only |
| 573913 | tt0114142 | NaN | NaN | NaN | NaN | NaN | NaN | 25 July 2010 | ur6302267 | False | Yet another erotic thriller involving a web of... | 3 | The women are hot, things end there | 7fd32d47-5622-40c9-9c86-aa419d027e10 | right_only |
| 573914 | tt0114142 | NaN | NaN | NaN | NaN | NaN | NaN | 27 December 2000 | ur0982623 | False | Just a standard softcore flick from the playbo... | 8 | Standard playboy production - bad script, sexy... | 4e1bbcec-957d-454c-9d5b-63a173238d51 | right_only |
Note that not every movie has reviews, and not every review has an associated movie.
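The unmatched rows on each side can also be summarised by counting the values of the `_merge` indicator column. This is a minimal sketch on invented toy frames (`toy_movies` and `toy_reviews` are assumptions, not the real data):

```python
import pandas as pd

# Invented toy data: one movie without reviews (tt3) and one review without a movie (tt2).
toy_movies = pd.DataFrame({"movie_id": ["tt1", "tt3"], "plot_summary": ["p1", "p3"]})
toy_reviews = pd.DataFrame({"movie_id": ["tt1", "tt1", "tt2"], "review_text": ["r1", "r2", "r3"]})

merged = pd.merge(toy_movies, toy_reviews, on="movie_id", how="outer", indicator=True)

# _merge is categorical with the values 'left_only', 'right_only' and 'both'.
merge_counts = merged["_merge"].value_counts()
print(merge_counts)
```

A non-zero `left_only` or `right_only` count signals keys that appear in only one of the two tables.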
Observe that tt0104014/ in movie_details should be tt0104014. Let us see whether there are more such formatting errors in movie_id, since it is the foreign key between the two tables reviews_df and movie_details_df.
movie_details_df[movie_details_df['movie_id'].astype(str).str.contains('/')]
| | movie_id | plot_summary | duration | genre | rating | release_date | plot_synopsis |
|---|---|---|---|---|---|---|---|
| 1570 | tt0104014/ | For a while now, beautiful 24-year-old Diana B... | 1h 33min | [Comedy, Drama] | 5.3 | 1992-02-21 | |
| 1571 | tt0114142/ | The marriage of David Burgess, a senior execut... | 1h 32min | [Drama, Thriller] | 4.0 | 1999-01-29 | |
movie_details_df['movie_id'] = movie_details_df['movie_id'].str.replace('/', '')
movie_reviews = pd.merge(movie_details_df, reviews_df, on = ['movie_id'], how = 'outer', indicator = True)
movie_reviews.head()
| | movie_id | plot_summary | duration | genre | rating_x | release_date | plot_synopsis | review_date | user_id | is_spoiler | review_text | rating_y | review_summary | review_id | _merge |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | tt0105112 | Former CIA analyst, Jack Ryan is in England wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | Jack Ryan (Ford) is on a "working vacation" in... | 5 March 2008 | ur16517420 | True | The second Tom Clancy novel made into a film (... | 6 | decent Clancy thriller | bbc4b9ff-ea1b-455a-8b7e-32370d35fe17 | both |
| 1 | tt0105112 | Former CIA analyst, Jack Ryan is in England wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | Jack Ryan (Ford) is on a "working vacation" in... | 15 October 2007 | ur13887584 | True | The second in what looks like becoming the 'Ja... | 7 | Jack Ryan Chapter II | e04d4f23-89b5-4715-add8-1ee37d46dd99 | both |
| 2 | tt0105112 | Former CIA analyst, Jack Ryan is in England wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | Jack Ryan (Ford) is on a "working vacation" in... | 16 September 2013 | ur2403537 | True | I was not a fan of The Hunt For Red October. I... | 8 | It's filled to the brim with excitement and it... | b28cbca4-dc30-421c-8e3a-dc2e3cba98c1 | both |
| 3 | tt0105112 | Former CIA analyst, Jack Ryan is in England wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | Jack Ryan (Ford) is on a "working vacation" in... | 15 May 2007 | ur0819382 | True | Jack Ryan (Harrison Ford) is a CIA analyst who... | 7 | Patriotic Gore. | c657363f-224d-48db-a029-8e4ddd2b248d | both |
| 4 | tt0105112 | Former CIA analyst, Jack Ryan is in England wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | Jack Ryan (Ford) is on a "working vacation" in... | 29 February 2004 | ur2567338 | True | This was one of the big summer movies of 1992.... | 7 | Harrison Ford In Another Tom Clancy Adaptation | 44c6a555-674d-4f1c-a9fa-88b35b82c8ba | both |
movie_reviews[movie_reviews['_merge'] == 'left_only']
| movie_id | plot_summary | duration | genre | rating_x | release_date | plot_synopsis | review_date | user_id | is_spoiler | review_text | rating_y | review_summary | review_id | _merge |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
movie_reviews[movie_reviews['_merge'] == 'right_only']
| movie_id | plot_summary | duration | genre | rating_x | release_date | plot_synopsis | review_date | user_id | is_spoiler | review_text | rating_y | review_summary | review_id | _merge |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Now every review has an associated movie, and every movie has at least one review.
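The key cleanup and the follow-up check can be sketched together on toy data. This is a hedged variant rather than the notebook's exact code: it strips only a trailing slash with `str.rstrip`, uses the `validate` argument of `pd.merge` to confirm that each movie_id is unique on the movie side, and asserts that no unmatched rows remain. The toy frames are invented.

```python
import pandas as pd

# Invented toy data with one malformed key ("tt2/").
toy_movies = pd.DataFrame({"movie_id": ["tt1", "tt2/"], "plot_summary": ["p1", "p2"]})
toy_reviews = pd.DataFrame({"movie_id": ["tt1", "tt1", "tt2"], "review_text": ["r1", "r2", "r3"]})

# Strip only a trailing slash, leaving the rest of the key untouched.
toy_movies["movie_id"] = toy_movies["movie_id"].str.rstrip("/")

merged = pd.merge(
    toy_movies, toy_reviews,
    on="movie_id", how="outer", indicator=True,
    validate="one_to_many",  # raises MergeError if a movie_id repeats in toy_movies
)

# Rows present on only one side would indicate keys that still do not match.
unmatched = merged[merged["_merge"] != "both"]
assert unmatched.empty, f"{len(unmatched)} rows have no partner"
```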
movie_reviews[movie_reviews['_merge'] == 'both']
| | movie_id | plot_summary | duration | genre | rating_x | release_date | plot_synopsis | review_date | user_id | is_spoiler | review_text | rating_y | review_summary | review_id | _merge |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | tt0105112 | Former CIA analyst, Jack Ryan is in England wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | Jack Ryan (Ford) is on a "working vacation" in... | 5 March 2008 | ur16517420 | True | The second Tom Clancy novel made into a film (... | 6 | decent Clancy thriller | bbc4b9ff-ea1b-455a-8b7e-32370d35fe17 | both |
| 1 | tt0105112 | Former CIA analyst, Jack Ryan is in England wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | Jack Ryan (Ford) is on a "working vacation" in... | 15 October 2007 | ur13887584 | True | The second in what looks like becoming the 'Ja... | 7 | Jack Ryan Chapter II | e04d4f23-89b5-4715-add8-1ee37d46dd99 | both |
| 2 | tt0105112 | Former CIA analyst, Jack Ryan is in England wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | Jack Ryan (Ford) is on a "working vacation" in... | 16 September 2013 | ur2403537 | True | I was not a fan of The Hunt For Red October. I... | 8 | It's filled to the brim with excitement and it... | b28cbca4-dc30-421c-8e3a-dc2e3cba98c1 | both |
| 3 | tt0105112 | Former CIA analyst, Jack Ryan is in England wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | Jack Ryan (Ford) is on a "working vacation" in... | 15 May 2007 | ur0819382 | True | Jack Ryan (Harrison Ford) is a CIA analyst who... | 7 | Patriotic Gore. | c657363f-224d-48db-a029-8e4ddd2b248d | both |
| 4 | tt0105112 | Former CIA analyst, Jack Ryan is in England wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | Jack Ryan (Ford) is on a "working vacation" in... | 29 February 2004 | ur2567338 | True | This was one of the big summer movies of 1992.... | 7 | Harrison Ford In Another Tom Clancy Adaptation | 44c6a555-674d-4f1c-a9fa-88b35b82c8ba | both |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 573908 | tt0104014 | For a while now, beautiful 24-year-old Diana B... | 1h 33min | [Comedy, Drama] | 5.3 | 1992-02-21 | | 25 October 2015 | ur63416246 | False | One of the best erotic movies of all times. Mo... | 8 | Excellento | 66bb64ee-b216-4809-bff0-d7ddf2bfbed8 | both |
| 573909 | tt0104014 | For a while now, beautiful 24-year-old Diana B... | 1h 33min | [Comedy, Drama] | 5.3 | 1992-02-21 | | 14 February 2006 | ur2851731 | False | call me crass, but i preferred the higher titi... | 3 | horrible male lead | 08f1b8d5-2801-4b4c-9348-7ceb54988a44 | both |
| 573910 | tt0104014 | For a while now, beautiful 24-year-old Diana B... | 1h 33min | [Comedy, Drama] | 5.3 | 1992-02-21 | | 11 April 2008 | ur10916790 | False | I heard Tinto Brass did eroticism, fine. I hea... | 1 | Go rent a playboy video | 1931f6ef-d218-49c3-968f-661406c64ec7 | both |
| 573911 | tt0114142 | The marriage of David Burgess, a senior execut... | 1h 32min | [Drama, Thriller] | 4.0 | 1999-01-29 | | 25 July 2010 | ur6302267 | False | Yet another erotic thriller involving a web of... | 3 | The women are hot, things end there | 7fd32d47-5622-40c9-9c86-aa419d027e10 | both |
| 573912 | tt0114142 | The marriage of David Burgess, a senior execut... | 1h 32min | [Drama, Thriller] | 4.0 | 1999-01-29 | | 27 December 2000 | ur0982623 | False | Just a standard softcore flick from the playbo... | 8 | Standard playboy production - bad script, sexy... | 4e1bbcec-957d-454c-9d5b-63a173238d51 | both |
573913 rows × 15 columns
Let us look at the reviews for movie_id 'tt0105112', which we examined earlier.
Since the problem is whether the review_text is a spoiler, let us focus on that field and compare a review marked as a spoiler with one that is not.
movie_reviews[(movie_reviews['movie_id'] == 'tt0105112') & (movie_reviews['is_spoiler'] == True)]['review_text'].head(1).item()
'The second Tom Clancy novel made into a film (\'The Hunt For Red October\' being the first), \'Patriot Games\' opts for a somewhat smaller scale. There\'s no chance World War III will result from anyone\'s actions here, but the CIA finds enough bad business coming from the Irish Republican Army to keep itself and those amazing spy satellites it employs busy. Harrison Ford is now Clancy\'s reliable hero, Jack Ryan, and Ford is a very good fit. In best Clancy fashion, Ryan is a very reluctant hero as well. He hates getting involved in other peoples\' affairs and being forced to settle them, just as he hates being forced to kick ass, but what\'s an ex-Marine CIA agent to do? The villains this time are an ultraviolent faction of the IRA; Ryan gets on their bad side when he busts up an attempt by them to kill the British Home Secretary and his family. The rest of the movie is essentially a tit-for-tat series of confrontations between Ryan/CIA and these IRA murderers. The best scenes in \'Patriot Games\' are the action ones, as well as the parts showing the CIA\'s supersecret and supersophisticated technology at work. Less successful are the scenes having to do with Ryan\'s home life, and his wife and daughter. There is an upper-class, peachy-keen smugness to them, exemplified by Anne Archer (Cathy Ryan) and her "boy, this CIA stuff is a pain in the ass" attitude. This type of material is not Clancy\'s strong suit (or the screenwriter\'s, apparently) and it shows. The trio of Irish baddies are all portrayed convincingly; Patrick Bergin, Polly Walker, and particularly Sean Bean in a menacing and venomous performance. \'Patriot Games\' succeeds mainly because of Harrison Ford\'s tough, likable, and sometimes vulnerable presence. Several key plot points do not bear close scrutiny; it strikes one as unlikely the IRA and the CIA would take on one another over what is essentially a personal vendetta. 
But don\'t think too hard or too much and you\'ll find \'Patriot Games\' a reasonably exciting thriller.'
movie_reviews[(movie_reviews['movie_id'] == 'tt0105112') & (movie_reviews['is_spoiler'] == False)]['review_text'].head(1).item()
'If I were to make a quick summary of this movie, it would be "a good, but not great, action flick." I don\'t think this movie was intended to be innovative, to be great, to be memorable. What it was intended to be was entertaining, and it lived up to this expectation.Harrison Ford is the master of remaining as low-key as possible. He has had so many films where he can over-emphasize, over-dramatize, and simply over-act. Ford has the maturity to keep this from happening, and he does so again in this film. Ford is supported quite well with such actors/actresses as James Earl Jones (reprising his role as Greer from The Hunt for Red October), Samuel L. Jackson, and Anne Archer. For an action flick, acting is much above average.It was also suspenseful...thrilling. I thought the ending had a nice build-up and climax. I realize this is different from the book\'s ending...but remember, these are two different visions...not the same one. This brings up a big point in movie-watching: Comparing a movie to a book is the worst thing you can do. They are two completely different genres. More importantly, as I said before, they are two different visions.I\'m not a big action fan. I used to be, but anymore most stories are the same. To this end I would say Patriot Games is fairly mainstream. However, it was always fun to watch and sometimes thrilling to watch. I\'m also not a big Tom Clancy fan, but I\'ve given the movies based off his books a shot because again they are different visions. The result is (at least with Patriot Games) a good film to watch on a Saturday night with a bowl of popcorn. Don\'t expect great things, but don\'t expect to be disappointed, either.Rating: 7/10'
The review_text appears to follow the plot_summary. The plot_synopsis may contain far more detail than the plot_summary. Given the limit on the sequence length that the deep learning models accept, plot_summary may be the more useful feature.
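The length argument can be checked roughly by comparing word counts against a token budget. The sketch below uses invented toy text and an assumed budget of 512 (`MAX_TOKENS` is a hypothetical constant, chosen to match the usual BERT-style limit). Whitespace word counts are only a proxy: a subword tokenizer typically produces at least as many tokens as there are words.

```python
import pandas as pd

# Invented toy row: a short summary, a long synopsis and a medium-length review.
toy = pd.DataFrame({
    "plot_summary": ["short summary of the film " * 5],
    "plot_synopsis": ["a very detailed scene by scene description " * 200],
    "review_text": ["i liked the ending where the hero wins " * 20],
})

MAX_TOKENS = 512  # assumed sequence budget, not taken from the notebook

def word_count(series: pd.Series) -> pd.Series:
    """Whitespace-separated word count per row (a rough proxy for token count)."""
    return series.str.split().str.len()

pair_len_summary = word_count(toy["review_text"]) + word_count(toy["plot_summary"])
pair_len_synopsis = word_count(toy["review_text"]) + word_count(toy["plot_synopsis"])

fits_summary = pair_len_summary <= MAX_TOKENS
fits_synopsis = pair_len_synopsis <= MAX_TOKENS
print(pair_len_summary.iloc[0], pair_len_synopsis.iloc[0])
```

In this toy case the review paired with the summary stays within the budget, while the review paired with the synopsis would have to be truncated.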
Examine the reviews for a movie whose plot synopsis has length 0.
movie_reviews[movie_reviews['movie_id'] == 'tt0243655']['is_spoiler'].value_counts()
is_spoiler
False    193
True      20
Name: count, dtype: int64
movie_reviews[(movie_reviews['movie_id'] == 'tt0243655') & (movie_reviews['is_spoiler'])]['review_text'].head(1).item()
'"Not ANOTHER \'Another Teen Movie,\'" is the accompanying groan to most introductions to Wet Hot American Summer. How could a title and premise that simultaneously stupid and generic be anything more than the sum of its exorbitantly familiar parts? It doesn\'t help that every plot summary reads roughly exactly how you\'d expect it to; even stray quotations out of context play like recycled outtakes from American Pie. If anything, it seems amazing how barefaced the film is about its complete unoriginality. So do yourself a favour \x96 stop reading plot summaries, perusing online quotes, and just go see the film instead. Don\'t watch the trailer. Even stop reading this review. Stop it. Please stop.What are you doing still reading? I asked you nicely and everything! \x85I have failed at my objective.The inevitable twist, of course: all of the above claims are actually true. And yet, somehow, Wet Hot American Summer remains an almost uncomfortably enjoyable watch. This is a film that gleefully tries to hide how far-removed from the film you think it\'s going to be, patiently wearing the mask of a normal \'Murrcan teenage coming-of-age camp drama, filled with first kisses, first times, illicit substances, and plenty of second-hand embarrassment. Don\'t get me wrong: those things are all still there. But the schmaltz of the standard framework has been jettisoned for something altogether livelier, funnier, and even more honest. \'Parody\' is the term used to encompass the film\'s approach to genre, but it may as well function as a parody of expectations of normal narrative beats and cohesion. It\'s almost as if writer/director David Wain and writer/star Michael Showalter have tossed a pinch of magic (or at least \'special sauce\') into the pressure cooker of rote garbage and let it rip with no lid, exploding everywhere in a mess too delicious to be turned down (or something\x85 I think that metaphor got away from me). The best joke of all? It works. Weirdly well. 
No, this is a film whose heart and inspiration lies somewhere between the SNL-spin off insanity of Meatballs and National Lampoon, blended with a healthy helping of the surrealistic insanity of Monty Python, and culminating in just enough heartfelt personal experience to feel strangely honest, in spite of the trappings. Narratively, it plays like a collection of improv sketches with only a rough template to eventually veer back to \x96 an Anchorman before Anchorman, if you will. Wain and Showalter take as much pleasure in making you fall in love with their earnest, archetypal characters, as they do pulling the rug out from what you would traditionally expect or even want to happen to them. Zaniness abounds, but somehow, by pushing the tropes to hyperbolic, with a side helping of irreverent and insane, ye olde teenage love subplot clichés feel far more heartfelt and credible than they would in the far flashier and stupider films that birthed them. Like any good comedy, this is a film that establishes its own internal world of conventions and possibilities. Yes, this is a world where Paul Rudd, upon finding out that, due to his negligence as a lifeguard, one of his campers has drowned, responds by offhandedly disposing of the witness\x85 and we\'re expected to laugh. And if laugh you do, the film\'s got another one for ya. Some of the bits are shrewder satirical genius (the "into town" sequence hysterically lampoons every parent\'s stern lectures about the escalation of substance use), but the film steadily gives in to being willfully deranged, building to a finale so ridiculous it feels like the apex of a drug trip\x85and, thus, entirely appropriate. In many ways, the film\'s biggest gag is its oddly prophetic assembly of actors who would go on to become comedic superstars, and an extra level of hilarity is derived from watching them irreverently having a blast here. 
Rudd steals the show, exquisitely uncompromising as the camp\'s resident pretty-boy-dirtbag, but Amy Poehler is a close second, perfecting her routine personas as both earnestly inspirational and ferocious control freak. Bradley \'blink-and-you- might-recognize-him\' Cooper, of all people, ends up headlining a romantic subplot that leads to the most unexpectedly tender moments of the film, while Christopher Meloni unabashedly steals every scene he appears in, a comedic triumph as the camp\'s shellshocked cook. Janeanne Garofalo and David Hyde-Pierce are both brilliantly in their element, blending sardonic barbs and rambling sweetness, as the camp\'s director and resident astrophysicist respectively, while Showalter himself is adorably fumbling as the lovestruck nerd (alongside a surprise cameo appearance) and Marguerite Moreau delivers some of the film\'s most incredibly acerbic dialogue while impressively never dropping her beaming smile. Molly Shannon uncorks a hysterically belligerent meltdown like none other, and Elizabeth Banks is hilariously dopey as one of the camp\'s seemingly endless slew of nymphomaniacs (go figure). Comedy this bonkers is certainly not for all tastes, but it\'s worth at least conceding which tastes the film should appeal to, which the film\'s latent cult success should attest to. Ultimately, in Wet Hot American Summer, coherence doesn\'t really matter: like an appropriately wild party, by the end of the film, it\'s hard to remember who was involved in which subplot, and who\'s kissed who, until you realize it hardly matters, and to just go along with the ride, and leave with Jefferson Starship\'s "Jane" still pounding in your head. Now, if you\'ll excuse me, I\'m going to go hump the refrigerator. -8/10'
movie_reviews[(movie_reviews['movie_id'] == 'tt0243655') & (movie_reviews['is_spoiler'] == False)]['review_text'].head(1).item()
'The strangest point in this film is a point, about a half hour through, when all of the seemingly normal camp counselors go out in to the city. In a montage shot, they slowly go from reading books at the library, to smoking cigarettes, to smoking joints, to buying cocaine from a guy on the street, to becoming prostitutes, then becoming strung out heroin-addicts at the local crack house. This is not your parent\'s parody movie.From this point on, the film is never the same. Seemingly normal character development goes out the window, and characters jump from one complete different personality to another within seconds. We get brilliant lines of dialogue that could only be brought from members of The State, like: "Hey, there\'s a problem. I\'ve got something I need to tell you." "Oh no! You have crabs." "No. Well, yeah, but that\'s not the problem." "Oh good."Pure genius!!! Why don\'t more comedies have lines of dialogue like this?Some other great points: "There is a way we could save everyone\'s lives. Well, no that couldn\'t work. In order for it to work, we would need to have a device that could randomly generate numbers between 1 and 20." "That\'s impossible. That would take some sort of highly advanced supercomputer to work." "Not necessarily. No dungeon master goes anywhere without his...20 sided die."I sort of wish I had known what films were being parodied in this movie. I saw strands that sort of matched what I had seen from old Summer Camp movies I had seen back in high school during the wee hours in the morning during comedy central. But, there were many things here that seemed like they must be references to other films. Oh well. It\'s just all crazy.Anyway, I guess I should say that this isn\'t a really good...movie, per se, but...well, I have no excuses for it. Wet Hot American Summer rules! Dolphins suck it!!!'
For this movie, the reviews appear to be marked as spoilers or not arbitrarily, irrespective of whether they contain plot details. We will therefore remove from the dataset the movies whose plot_synopsis is empty.
movie_reviews = movie_reviews[movie_reviews['plot_synopsis'].str.len() != 0]
movie_reviews[movie_reviews['movie_id']== 'tt0243655']
| movie_id | plot_summary | duration | genre | rating_x | release_date | plot_synopsis | review_date | user_id | is_spoiler | review_text | rating_y | review_summary | review_id | _merge |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
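A slightly more defensive variant of this filter is sketched below on invented toy data. It differs from the filter above in two ways, both of which are assumptions for illustration: it also treats whitespace-only synopses as empty, and it takes an explicit `.copy()` so that later column assignments act on an independent frame instead of a slice of the original.

```python
import pandas as pd

# Invented toy data: one usable synopsis, one empty, one whitespace-only.
toy = pd.DataFrame({
    "movie_id": ["tt1", "tt2", "tt3"],
    "plot_synopsis": ["Full Synopsis", "", "  "],
    "is_spoiler": [True, False, True],
})

# Keep rows whose synopsis still has characters after stripping whitespace.
has_synopsis = toy["plot_synopsis"].str.strip().str.len() > 0
filtered = toy[has_synopsis].copy()

# Assigning to a column of the copy leaves the original frame untouched.
filtered["plot_synopsis"] = filtered["plot_synopsis"].str.lower()
print(filtered)
```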
Preprocess all text columns for transformer tokenization¶
Lowercasing¶
# Lowercasing
movie_reviews['review_text'] = movie_reviews['review_text'].map(lambda x: x.lower())
movie_reviews['plot_synopsis'] = movie_reviews['plot_synopsis'].map(lambda x: x.lower())
movie_reviews['plot_summary'] = movie_reviews['plot_summary'].map(lambda x: x.lower())
movie_reviews['review_summary'] = movie_reviews['review_summary'].map(lambda x: x.lower())
movie_reviews
| | movie_id | plot_summary | duration | genre | rating_x | release_date | plot_synopsis | review_date | user_id | is_spoiler | review_text | rating_y | review_summary | review_id | _merge |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | jack ryan (ford) is on a "working vacation" in... | 5 March 2008 | ur16517420 | True | the second tom clancy novel made into a film (... | 6 | decent clancy thriller | bbc4b9ff-ea1b-455a-8b7e-32370d35fe17 | both |
| 1 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | jack ryan (ford) is on a "working vacation" in... | 15 October 2007 | ur13887584 | True | the second in what looks like becoming the 'ja... | 7 | jack ryan chapter ii | e04d4f23-89b5-4715-add8-1ee37d46dd99 | both |
| 2 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | jack ryan (ford) is on a "working vacation" in... | 16 September 2013 | ur2403537 | True | i was not a fan of the hunt for red october. i... | 8 | it's filled to the brim with excitement and it... | b28cbca4-dc30-421c-8e3a-dc2e3cba98c1 | both |
| 3 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | jack ryan (ford) is on a "working vacation" in... | 15 May 2007 | ur0819382 | True | jack ryan (harrison ford) is a cia analyst who... | 7 | patriotic gore. | c657363f-224d-48db-a029-8e4ddd2b248d | both |
| 4 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | jack ryan (ford) is on a "working vacation" in... | 29 February 2004 | ur2567338 | True | this was one of the big summer movies of 1992.... | 7 | harrison ford in another tom clancy adaptation | 44c6a555-674d-4f1c-a9fa-88b35b82c8ba | both |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 573901 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | [Action, Drama, History] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 2 August 2017 | ur78983408 | False | dunkirk is a beautifully done movie that has h... | 10 | possibly the greatest war film ever made | ef14e72a-77cd-4b9c-9f85-37d01c55fded | both |
| 573902 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | [Action, Drama, History] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 2 August 2017 | ur53645293 | False | dunkirk is one of the rare cases a film receiv... | 10 | why this could actually be one of the best war... | d4bfc31d-ae8e-4e48-b616-936132153630 | both |
| 573903 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | [Action, Drama, History] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 31 July 2017 | ur45202903 | False | film gave insufficient background on what was ... | 2 | disappointed | e47d2365-78d9-44c1-86e8-9c02d190b0cb | both |
| 573904 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | [Action, Drama, History] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 28 July 2017 | ur1319404 | False | in screen writing, a shot is an image captured... | 7 | dunkirk for millennials | d985d458-0b1b-4cec-a5d2-6e9431770ce0 | both |
| 573905 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | [Action, Drama, History] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 25 July 2017 | ur4161069 | False | in a movie that entirely engulfs you it's rath... | 10 | christopher nolan reaches yet another new high... | 88c38da2-d4f7-4ee2-ae1b-76c8d430d495 | both |
538828 rows × 15 columns
Removing unnecessary white spaces¶
# Removing unnecessary whitespaces
movie_reviews['review_text'] = movie_reviews['review_text'].map(lambda x: ' '.join(x.split()))
movie_reviews['plot_synopsis'] = movie_reviews['plot_synopsis'].map(lambda x: ' '.join(x.split()))
movie_reviews['plot_summary'] = movie_reviews['plot_summary'].map(lambda x: ' '.join(x.split()))
movie_reviews['review_summary'] = movie_reviews['review_summary'].map(lambda x: ' '.join(x.split()))
movie_reviews
| | movie_id | plot_summary | duration | genre | rating_x | release_date | plot_synopsis | review_date | user_id | is_spoiler | review_text | rating_y | review_summary | review_id | _merge |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | jack ryan (ford) is on a "working vacation" in... | 5 March 2008 | ur16517420 | True | the second tom clancy novel made into a film (... | 6 | decent clancy thriller | bbc4b9ff-ea1b-455a-8b7e-32370d35fe17 | both |
| 1 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | jack ryan (ford) is on a "working vacation" in... | 15 October 2007 | ur13887584 | True | the second in what looks like becoming the 'ja... | 7 | jack ryan chapter ii | e04d4f23-89b5-4715-add8-1ee37d46dd99 | both |
| 2 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | jack ryan (ford) is on a "working vacation" in... | 16 September 2013 | ur2403537 | True | i was not a fan of the hunt for red october. i... | 8 | it's filled to the brim with excitement and it... | b28cbca4-dc30-421c-8e3a-dc2e3cba98c1 | both |
| 3 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | jack ryan (ford) is on a "working vacation" in... | 15 May 2007 | ur0819382 | True | jack ryan (harrison ford) is a cia analyst who... | 7 | patriotic gore. | c657363f-224d-48db-a029-8e4ddd2b248d | both |
| 4 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | jack ryan (ford) is on a "working vacation" in... | 29 February 2004 | ur2567338 | True | this was one of the big summer movies of 1992.... | 7 | harrison ford in another tom clancy adaptation | 44c6a555-674d-4f1c-a9fa-88b35b82c8ba | both |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 573901 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | [Action, Drama, History] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 2 August 2017 | ur78983408 | False | dunkirk is a beautifully done movie that has h... | 10 | possibly the greatest war film ever made | ef14e72a-77cd-4b9c-9f85-37d01c55fded | both |
| 573902 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | [Action, Drama, History] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 2 August 2017 | ur53645293 | False | dunkirk is one of the rare cases a film receiv... | 10 | why this could actually be one of the best war... | d4bfc31d-ae8e-4e48-b616-936132153630 | both |
| 573903 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | [Action, Drama, History] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 31 July 2017 | ur45202903 | False | film gave insufficient background on what was ... | 2 | disappointed | e47d2365-78d9-44c1-86e8-9c02d190b0cb | both |
| 573904 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | [Action, Drama, History] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 28 July 2017 | ur1319404 | False | in screen writing, a shot is an image captured... | 7 | dunkirk for millennials | d985d458-0b1b-4cec-a5d2-6e9431770ce0 | both |
| 573905 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | [Action, Drama, History] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 25 July 2017 | ur4161069 | False | in a movie that entirely engulfs you it's rath... | 10 | christopher nolan reaches yet another new high... | 88c38da2-d4f7-4ee2-ae1b-76c8d430d495 | both |
538828 rows × 15 columns
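Because the truncated table preview does not show the effect of this step, here is a minimal sketch of the same `' '.join(x.split())` idiom on a made-up string. The helper name `collapse_whitespace` and the sample text are assumptions for illustration and are not part of the dataset or the notebook.

```python
def collapse_whitespace(text: str) -> str:
    """Collapse runs of spaces, tabs and newlines into single spaces and trim both ends."""
    # str.split() with no argument splits on any whitespace run and drops empty pieces
    return " ".join(text.split())


# Hypothetical sample with leading/trailing spaces, a tab and blank lines
sample = "  jack ryan\t is on a\n\n working   vacation  "
cleaned = collapse_whitespace(sample)
print(repr(cleaned))  # 'jack ryan is on a working vacation'
```

The idiom also trims leading and trailing whitespace, so no separate `strip()` call is needed.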
Normalize unicode characters¶
# Normalize unicode characters
import unicodedata

def remove_accents(input_str):
    # Decompose accented characters (NFKD) and drop the combining marks, e.g. 'é' -> 'e'
    nfkd_form = unicodedata.normalize('NFKD', input_str)
    return "".join([c for c in nfkd_form if not unicodedata.combining(c)])

text_columns = ['review_text', 'plot_synopsis', 'plot_summary', 'review_summary']
for col in text_columns:
    # Strip accents first: encoding to ASCII before this step would delete accented letters outright
    movie_reviews[col] = movie_reviews[col].map(remove_accents)
    # Then drop any remaining non-ASCII characters
    movie_reviews[col] = movie_reviews[col].map(lambda x: x.encode('ascii', 'ignore').decode('ascii'))
movie_reviews
| | movie_id | plot_summary | duration | genre | rating_x | release_date | plot_synopsis | review_date | user_id | is_spoiler | review_text | rating_y | review_summary | review_id | _merge |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | jack ryan (ford) is on a "working vacation" in... | 5 March 2008 | ur16517420 | True | the second tom clancy novel made into a film (... | 6 | decent clancy thriller | bbc4b9ff-ea1b-455a-8b7e-32370d35fe17 | both |
| 1 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | jack ryan (ford) is on a "working vacation" in... | 15 October 2007 | ur13887584 | True | the second in what looks like becoming the 'ja... | 7 | jack ryan chapter ii | e04d4f23-89b5-4715-add8-1ee37d46dd99 | both |
| 2 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | jack ryan (ford) is on a "working vacation" in... | 16 September 2013 | ur2403537 | True | i was not a fan of the hunt for red october. i... | 8 | it's filled to the brim with excitement and it... | b28cbca4-dc30-421c-8e3a-dc2e3cba98c1 | both |
| 3 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | jack ryan (ford) is on a "working vacation" in... | 15 May 2007 | ur0819382 | True | jack ryan (harrison ford) is a cia analyst who... | 7 | patriotic gore. | c657363f-224d-48db-a029-8e4ddd2b248d | both |
| 4 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | jack ryan (ford) is on a "working vacation" in... | 29 February 2004 | ur2567338 | True | this was one of the big summer movies of 1992.... | 7 | harrison ford in another tom clancy adaptation | 44c6a555-674d-4f1c-a9fa-88b35b82c8ba | both |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 573901 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | [Action, Drama, History] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 2 August 2017 | ur78983408 | False | dunkirk is a beautifully done movie that has h... | 10 | possibly the greatest war film ever made | ef14e72a-77cd-4b9c-9f85-37d01c55fded | both |
| 573902 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | [Action, Drama, History] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 2 August 2017 | ur53645293 | False | dunkirk is one of the rare cases a film receiv... | 10 | why this could actually be one of the best war... | d4bfc31d-ae8e-4e48-b616-936132153630 | both |
| 573903 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | [Action, Drama, History] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 31 July 2017 | ur45202903 | False | film gave insufficient background on what was ... | 2 | disappointed | e47d2365-78d9-44c1-86e8-9c02d190b0cb | both |
| 573904 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | [Action, Drama, History] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 28 July 2017 | ur1319404 | False | in screen writing, a shot is an image captured... | 7 | dunkirk for millennials | d985d458-0b1b-4cec-a5d2-6e9431770ce0 | both |
| 573905 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | [Action, Drama, History] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 25 July 2017 | ur4161069 | False | in a movie that entirely engulfs you it's rath... | 10 | christopher nolan reaches yet another new high... | 88c38da2-d4f7-4ee2-ae1b-76c8d430d495 | both |
538828 rows × 15 columns
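The order of the two normalization steps matters, which the table preview cannot show. The sketch below compares both orders on a made-up string. The helper `strip_to_ascii` and the sample text are assumptions for illustration; `remove_accents` mirrors the function used in this notebook.

```python
import unicodedata


def strip_to_ascii(text: str) -> str:
    """Drop every character that cannot be encoded as ASCII."""
    return text.encode("ascii", "ignore").decode("ascii")


def remove_accents(text: str) -> str:
    """Decompose characters (NFKD) and remove the combining accent marks."""
    nfkd_form = unicodedata.normalize("NFKD", text)
    return "".join(c for c in nfkd_form if not unicodedata.combining(c))


# Hypothetical sample; \u00e9 is the precomposed letter e with an acute accent
sample = "am\u00e9lie visits a caf\u00e9"

# ASCII stripping first deletes the accented letters, so nothing is left to normalize
ascii_first = remove_accents(strip_to_ascii(sample))
# Accent removal first keeps the base letters
accents_first = strip_to_ascii(remove_accents(sample))

print(ascii_first)    # amlie visits a caf
print(accents_first)  # amelie visits a cafe
```

Running accent removal before the ASCII filter keeps names and loan words readable, which is likely preferable for a tokenizer such as the uncased BERT vocabulary.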
movie_reviews.head(1)['plot_synopsis'].item()
'jack ryan (ford) is on a "working vacation" in london with his family. he has retired from the cia and is a professor at the us naval academy. he is seen delivering a lecture at the royal naval academy in london.meanwhile, ryan\'s wife cathy and daughter sally are sightseeing near buckingham palace. sally and cathy come upon a british royal guard, and sally tries to get the guard to react by doing an improvised tap dance in front of him. she\'s impressed when the guard, trained to ignore distraction, doesn\'t react at all, and they leave.as sally and cathy walk away from the guard, en route to rendezvous with ryan, they walk by a stolen cab, in which sit three ulster liberation army terrorists: kevin o\'donnell, the driver, as well as sean miller (sean bean) and his younger brother patrick. the three are loading bullets into their guns as they prepare to carry out a scheduled ambush on lord william holmes, british secretary of state for northern ireland and a distant member of the british royal family (the cousin of the queen mother). as they finish loading their weapons, another terrorist, annette, radios to them that lord holmes is leaving the palace with his wife and son.as lord holmes\'s car leaves the palace, the terrorists\' car falls in line behind him. they follow the car, headed for the ambush point. as they drive, o\'donnell and the miller brothers don ski masks.meanwhile, ryan is about to cross the street to meet with his wife and daughter. he arrives at the same time that lord holmes\'s car is passing through the area.suddenly, another cab, parked on the side of the road, driven by annette, pulls out and swerves in front of lord holmes\'s vehicle. simultaneously, o\'donnell\'s vehicle swerves to a stop behind the car. the millers and another accomplice jump out and quickly plant a bomb underneath the chauffeured car\'s engine. ryan sees what\'s going on and hastily gets his family to cover just as the bomb is detonated. 
moments later, sean and another accomplice begin shooting into the car, riddling lord holmes\'s driver and bodyguard with bullet. the onslaught of bullets does not kill lord holmes, who is currently using his body to shield his family. sean marches over to the rear door and orders lord holmes out of the car at gunpoint. ryan, unable to stand being just a bystander, immediately breaks cover, runs up behind miller, and disarms him. as ryan grabs sean\'s pistol, he gets shot in the left shoulder by another terrorist. he shoots one of the other terrorists in the shoulder. patrick attempts to flee, but ryan fatally shoots him in the chest. seeing two of the royal guards racing towards the scene, the remaining terrorists quickly get back in their cars and drive off, leaving sean to be captured when the authorities arrive. they retreat back to a warehouse, where they change out license plates.while recovering, ryan is called to testify against miller as a witness. subsequently, miller is convicted on all charges and sentenced to life in prison. ryan is awarded the order of knight commander of the victorian order, and eventually returns to the united states.on the possibility that miller\'s ula comrades might try to liberate him, the authorities take no chances. to that extent, they set up decoy convoys on the day that miller is to be transferred to albany prison on the isle of wight. however, someone manages to tip off the ula as to which convoy is carrying miller. miller\'s motorcade is in transit when it comes to a stop at a drawbridge. the raised bridge, however, is a trap meant to block them in. moments after the convoy comes to a stop, miller\'s comrades attack, using rocket launchers to blow up the cars at the front and rear of the convoy. they then march up to the prison van and order the guards out at gunpoint by taking the bridge operator hostage and threatening to shoot him. one of the guards hesitates and is shot fatally. 
the other officer and the inspector are pulled out, thrown on the ground and forced to lie on their stomachs. o\'donnell hands miller a pistol, with which he coldly executes both officers and the bridge operator by shooting them in the backs of their heads. they then depart the scene.miller and his companions flee on a cargo ship from britain to libya, to prepare for their next kidnapping attempt on lord holmes. miller however, cannot shake his anger towards ryan for killing his younger brother and persuades several members of his entourage to accompany him to the united states on a short mission targeting ryan and his family.miller travels to the united states, accompanied by annette, comrade ned clark, and a fourth henchman. annette and clark travel to the united states naval academy to ambush ryan as he\'s leaving work. ryan notices clark idling nearby. clark casually walks away, but ryan clearly unnerved, then gets even more nervous as he hears a car engine starting. his suspicions are well warranted-as he continues walking down the street, he looks in the reflection of a parked van\'s rearview door and sees that clark, and the stolen car driven by annette, are following him. ryan pretends not to see them until he\'s walking past a parked panel truck, at which point he ducks behind the truck. clark draws a silenced pistol and prepares to corner ryan, but ryan attacks him first. clark quickly gets the upper hand and throws ryan to the curb. he grabs his pistol and is about to shoot ryan when he\'s shot dead himself by a us marine sentry. annette quickly drives away in her jeep. ryan, realizing that miller is going to target cathy and sally, jumps into his car and drives off.simultaneously, miller and the other henchman follow cathy as she picks up sally from school, with miller riding shotgun. they prepare to follow the two, but are temporarily held up by a crossing guard. once the crossing guard clears, they pursue her to us highway 50. 
ryan tries to frantically call cathy on his car phone, but she is busy answering phone calls from the hospital. as he\'s trying to connect with her, miller and his henchman recklessly weave in and out of traffic to catch up with cathy. ryan eventually gets through to cathy and tells her to get off the highway and find the nearest police station. unfortunately, just as the call ends, miller\'s van comes charging up on her using the left shoulder. miller moves to the backseat of the van, grabs a submachine gun, and slides open the door. another car momentarily delays him, but once he has a clear aim, miller opens fire, spraying cathy\'s car with bullets. a multicar pileup ensues as cathy loses control of her car and crashes head-on into a concrete barrier. ryan, who has found himself stuck in traffic headed in the other direction, sees the smoke from the crash and realizes the worst.cathy and sally are airlifted to a nearby hospital for treatment. enraged over the near-loss of his family, ryan decides to go back to work for the cia, having earlier rejected the appeal of his former superior, vice admiral james greer (james earl jones).ryan\'s tireless work leads him to conclude that miller has taken refuge in a training camp, one of many located in libya. he also determines that a recent new arrival at the camp may be one of o\'donnell\'s english informants, dennis cooley. ryan makes his recommendations to his superiors at the cia. greer invites jack to a live feed of an sas strike team attack on the camp. everyone in the camp is eliminated but unbeknownst to the cia and ryan, miller and his companions have already fled the camp and are on their way to the us to stage their next attack on lord holmes.lord holmes decides to visit ryan at his home to formally present his kcvo. 
with the aid of lord holmes\' traitorous assistant, miller\'s group tracks holmes to this location, kills the diplomatic security service agents and maryland state troopers guarding the house, and attempts once more to kidnap lord holmes. ryan leads holmes and his family to safety while he attempts to lure miller and his companions away from his home.the fbi hostage rescue teams are scrambled to pick up holmes. upon realising that ryan is leading them away from holmes, miller\'s companions try to persuade miller to turn around, but an enraged and deranged miller kills his terrorist companions and continues his pursuit of ryan. ryan and miller fight hand to hand; miller is killed when ryan impales him backward on a boat anchor, and his body is obliterated in the subsequent explosion of the craft.credits roll just after caroline ryan learns the gender of the child she is going to have, and before she tells jack and sally.'
Remove URLs, HTML tags and special characters¶
# Remove URLs, HTML tags and special characters
import re

def remove_urls(text):
    # Remove URLs
    text = re.sub(r'http\S+', '', text)  # Remove http URLs
    text = re.sub(r'www\S+', '', text)  # Remove www URLs
    # Remove HTML tags
    text = re.sub(r'<.*?>', '', text)
    # Remove special characters: keep only word characters, whitespace and . , ? ! '
    text = re.sub(r'[^\w\s.,?!\']', '', text)
    return text

movie_reviews['review_text'] = movie_reviews['review_text'].map(remove_urls)
movie_reviews['plot_synopsis'] = movie_reviews['plot_synopsis'].map(remove_urls)
movie_reviews['plot_summary'] = movie_reviews['plot_summary'].map(remove_urls)
movie_reviews['review_summary'] = movie_reviews['review_summary'].map(remove_urls)
movie_reviews.head(1)['plot_synopsis'].item()
"jack ryan ford is on a working vacation in london with his family. he has retired from the cia and is a professor at the us naval academy. he is seen delivering a lecture at the royal naval academy in london.meanwhile, ryan's wife cathy and daughter sally are sightseeing near buckingham palace. sally and cathy come upon a british royal guard, and sally tries to get the guard to react by doing an improvised tap dance in front of him. she's impressed when the guard, trained to ignore distraction, doesn't react at all, and they leave.as sally and cathy walk away from the guard, en route to rendezvous with ryan, they walk by a stolen cab, in which sit three ulster liberation army terrorists kevin o'donnell, the driver, as well as sean miller sean bean and his younger brother patrick. the three are loading bullets into their guns as they prepare to carry out a scheduled ambush on lord william holmes, british secretary of state for northern ireland and a distant member of the british royal family the cousin of the queen mother. as they finish loading their weapons, another terrorist, annette, radios to them that lord holmes is leaving the palace with his wife and son.as lord holmes's car leaves the palace, the terrorists' car falls in line behind him. they follow the car, headed for the ambush point. as they drive, o'donnell and the miller brothers don ski masks.meanwhile, ryan is about to cross the street to meet with his wife and daughter. he arrives at the same time that lord holmes's car is passing through the area.suddenly, another cab, parked on the side of the road, driven by annette, pulls out and swerves in front of lord holmes's vehicle. simultaneously, o'donnell's vehicle swerves to a stop behind the car. the millers and another accomplice jump out and quickly plant a bomb underneath the chauffeured car's engine. ryan sees what's going on and hastily gets his family to cover just as the bomb is detonated. 
moments later, sean and another accomplice begin shooting into the car, riddling lord holmes's driver and bodyguard with bullet. the onslaught of bullets does not kill lord holmes, who is currently using his body to shield his family. sean marches over to the rear door and orders lord holmes out of the car at gunpoint. ryan, unable to stand being just a bystander, immediately breaks cover, runs up behind miller, and disarms him. as ryan grabs sean's pistol, he gets shot in the left shoulder by another terrorist. he shoots one of the other terrorists in the shoulder. patrick attempts to flee, but ryan fatally shoots him in the chest. seeing two of the royal guards racing towards the scene, the remaining terrorists quickly get back in their cars and drive off, leaving sean to be captured when the authorities arrive. they retreat back to a warehouse, where they change out license plates.while recovering, ryan is called to testify against miller as a witness. subsequently, miller is convicted on all charges and sentenced to life in prison. ryan is awarded the order of knight commander of the victorian order, and eventually returns to the united states.on the possibility that miller's ula comrades might try to liberate him, the authorities take no chances. to that extent, they set up decoy convoys on the day that miller is to be transferred to albany prison on the isle of wight. however, someone manages to tip off the ula as to which convoy is carrying miller. miller's motorcade is in transit when it comes to a stop at a drawbridge. the raised bridge, however, is a trap meant to block them in. moments after the convoy comes to a stop, miller's comrades attack, using rocket launchers to blow up the cars at the front and rear of the convoy. they then march up to the prison van and order the guards out at gunpoint by taking the bridge operator hostage and threatening to shoot him. one of the guards hesitates and is shot fatally. 
the other officer and the inspector are pulled out, thrown on the ground and forced to lie on their stomachs. o'donnell hands miller a pistol, with which he coldly executes both officers and the bridge operator by shooting them in the backs of their heads. they then depart the scene.miller and his companions flee on a cargo ship from britain to libya, to prepare for their next kidnapping attempt on lord holmes. miller however, cannot shake his anger towards ryan for killing his younger brother and persuades several members of his entourage to accompany him to the united states on a short mission targeting ryan and his family.miller travels to the united states, accompanied by annette, comrade ned clark, and a fourth henchman. annette and clark travel to the united states naval academy to ambush ryan as he's leaving work. ryan notices clark idling nearby. clark casually walks away, but ryan clearly unnerved, then gets even more nervous as he hears a car engine starting. his suspicions are well warrantedas he continues walking down the street, he looks in the reflection of a parked van's rearview door and sees that clark, and the stolen car driven by annette, are following him. ryan pretends not to see them until he's walking past a parked panel truck, at which point he ducks behind the truck. clark draws a silenced pistol and prepares to corner ryan, but ryan attacks him first. clark quickly gets the upper hand and throws ryan to the curb. he grabs his pistol and is about to shoot ryan when he's shot dead himself by a us marine sentry. annette quickly drives away in her jeep. ryan, realizing that miller is going to target cathy and sally, jumps into his car and drives off.simultaneously, miller and the other henchman follow cathy as she picks up sally from school, with miller riding shotgun. they prepare to follow the two, but are temporarily held up by a crossing guard. once the crossing guard clears, they pursue her to us highway 50. 
ryan tries to frantically call cathy on his car phone, but she is busy answering phone calls from the hospital. as he's trying to connect with her, miller and his henchman recklessly weave in and out of traffic to catch up with cathy. ryan eventually gets through to cathy and tells her to get off the highway and find the nearest police station. unfortunately, just as the call ends, miller's van comes charging up on her using the left shoulder. miller moves to the backseat of the van, grabs a submachine gun, and slides open the door. another car momentarily delays him, but once he has a clear aim, miller opens fire, spraying cathy's car with bullets. a multicar pileup ensues as cathy loses control of her car and crashes headon into a concrete barrier. ryan, who has found himself stuck in traffic headed in the other direction, sees the smoke from the crash and realizes the worst.cathy and sally are airlifted to a nearby hospital for treatment. enraged over the nearloss of his family, ryan decides to go back to work for the cia, having earlier rejected the appeal of his former superior, vice admiral james greer james earl jones.ryan's tireless work leads him to conclude that miller has taken refuge in a training camp, one of many located in libya. he also determines that a recent new arrival at the camp may be one of o'donnell's english informants, dennis cooley. ryan makes his recommendations to his superiors at the cia. greer invites jack to a live feed of an sas strike team attack on the camp. everyone in the camp is eliminated but unbeknownst to the cia and ryan, miller and his companions have already fled the camp and are on their way to the us to stage their next attack on lord holmes.lord holmes decides to visit ryan at his home to formally present his kcvo. 
with the aid of lord holmes' traitorous assistant, miller's group tracks holmes to this location, kills the diplomatic security service agents and maryland state troopers guarding the house, and attempts once more to kidnap lord holmes. ryan leads holmes and his family to safety while he attempts to lure miller and his companions away from his home.the fbi hostage rescue teams are scrambled to pick up holmes. upon realising that ryan is leading them away from holmes, miller's companions try to persuade miller to turn around, but an enraged and deranged miller kills his terrorist companions and continues his pursuit of ryan. ryan and miller fight hand to hand miller is killed when ryan impales him backward on a boat anchor, and his body is obliterated in the subsequent explosion of the craft.credits roll just after caroline ryan learns the gender of the child she is going to have, and before she tells jack and sally."
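One visible side effect in the output above is that hyphenated words are fused together ("warrantedas", "headon", "nearloss"), because the special-character regex deletes the hyphen without leaving a space. The sketch below contrasts that behavior with a possible alternative that substitutes a space and then collapses whitespace. The function names and the sample sentence are assumptions for illustration, not the method used to produce the results in this notebook.

```python
import re

SPECIAL_CHARS = r"[^\w\s.,?!']"  # same character class as the cleaning step above


def strip_specials(text: str) -> str:
    """Remove HTML tags, then delete special characters (the notebook's behavior)."""
    text = re.sub(r"<.*?>", "", text)
    return re.sub(SPECIAL_CHARS, "", text)


def strip_specials_spaced(text: str) -> str:
    """Hypothetical variant: replace special characters with a space, then tidy whitespace."""
    text = re.sub(r"<.*?>", "", text)
    text = re.sub(SPECIAL_CHARS, " ", text)
    return " ".join(text.split())


# Made-up sample modelled on phrases from the synopsis
sample = "his suspicions are well warranted-as he crashes head-on <b>into</b> a barrier."

deleted = strip_specials(sample)
spaced = strip_specials_spaced(sample)

print(deleted)  # his suspicions are well warrantedas he crashes headon into a barrier.
print(spaced)   # his suspicions are well warranted as he crashes head on into a barrier.
```

Whether the spaced variant helps a downstream model is an open question here; it simply avoids creating out-of-vocabulary tokens such as "headon".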
movie_reviews
| | movie_id | plot_summary | duration | genre | rating_x | release_date | plot_synopsis | review_date | user_id | is_spoiler | review_text | rating_y | review_summary | review_id | _merge |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | jack ryan ford is on a working vacation in lon... | 5 March 2008 | ur16517420 | True | the second tom clancy novel made into a film '... | 6 | decent clancy thriller | bbc4b9ff-ea1b-455a-8b7e-32370d35fe17 | both |
| 1 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | jack ryan ford is on a working vacation in lon... | 15 October 2007 | ur13887584 | True | the second in what looks like becoming the 'ja... | 7 | jack ryan chapter ii | e04d4f23-89b5-4715-add8-1ee37d46dd99 | both |
| 2 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | jack ryan ford is on a working vacation in lon... | 16 September 2013 | ur2403537 | True | i was not a fan of the hunt for red october. i... | 8 | it's filled to the brim with excitement and it... | b28cbca4-dc30-421c-8e3a-dc2e3cba98c1 | both |
| 3 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | jack ryan ford is on a working vacation in lon... | 15 May 2007 | ur0819382 | True | jack ryan harrison ford is a cia analyst who s... | 7 | patriotic gore. | c657363f-224d-48db-a029-8e4ddd2b248d | both |
| 4 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | [Action, Thriller] | 6.9 | 1992-06-05 | jack ryan ford is on a working vacation in lon... | 29 February 2004 | ur2567338 | True | this was one of the big summer movies of 1992.... | 7 | harrison ford in another tom clancy adaptation | 44c6a555-674d-4f1c-a9fa-88b35b82c8ba | both |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 573901 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | [Action, Drama, History] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 2 August 2017 | ur78983408 | False | dunkirk is a beautifully done movie that has h... | 10 | possibly the greatest war film ever made | ef14e72a-77cd-4b9c-9f85-37d01c55fded | both |
| 573902 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | [Action, Drama, History] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 2 August 2017 | ur53645293 | False | dunkirk is one of the rare cases a film receiv... | 10 | why this could actually be one of the best war... | d4bfc31d-ae8e-4e48-b616-936132153630 | both |
| 573903 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | [Action, Drama, History] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 31 July 2017 | ur45202903 | False | film gave insufficient background on what was ... | 2 | disappointed | e47d2365-78d9-44c1-86e8-9c02d190b0cb | both |
| 573904 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | [Action, Drama, History] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 28 July 2017 | ur1319404 | False | in screen writing, a shot is an image captured... | 7 | dunkirk for millennials | d985d458-0b1b-4cec-a5d2-6e9431770ce0 | both |
| 573905 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | [Action, Drama, History] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 25 July 2017 | ur4161069 | False | in a movie that entirely engulfs you it's rath... | 10 | christopher nolan reaches yet another new high... | 88c38da2-d4f7-4ee2-ae1b-76c8d430d495 | both |
538828 rows × 15 columns
Expand contractions
# Expand contractions
import re

# Contractions dictionary (keys are matched case-insensitively)
contractions = {
    "aren't": "are not", "can't": "cannot", "couldn't": "could not", "didn't": "did not",
    "doesn't": "does not", "don't": "do not", "hadn't": "had not", "hasn't": "has not",
    "haven't": "have not", "he'd": "he had", "he'll": "he will", "he's": "he is", "i'd": "i had",
    "i'll": "i will", "i'm": "i am", "i've": "i have",
    "I'd": "i had", "I'll": "i will", "I'm": "i am", "I've": "i have", "isn't": "is not",
    "let's": "let us", "mightn't": "might not", "mustn't": "must not", "shan't": "shall not",
    "she'd": "she had", "she'll": "she will", "she's": "she is", "shouldn't": "should not",
    "that's": "that is", "there's": "there is", "they'd": "they had", "they'll": "they will",
    "they're": "they are", "they've": "they have", "we'd": "we had", "we're": "we are",
    "we've": "we have", "weren't": "were not", "what'll": "what will", "what're": "what are",
    "what's": "what is", "what've": "what have", "where's": "where is", "who'd": "who had",
    "who'll": "who will", "who're": "who are", "who's": "who is", "who've": "who have",
    "won't": "will not", "wouldn't": "would not", "you'd": "you had", "you'll": "you will",
    "you're": "you are", "you've": "you have"
}

# Function to expand contractions
def expand_contractions(text, contractions_dict):
    #text = text.replace("\\'", "'")
    # Escape each key and anchor on word boundaries so that a key such as
    # "he's" is not matched inside a longer word.
    contractions_pattern = re.compile(
        r"\b(" + "|".join(re.escape(key) for key in contractions_dict) + r")\b",
        flags=re.IGNORECASE,
    )

    def replace(match):
        # print(match)  # debug trace: one line per match, very noisy on the full corpus
        matched = match.group(0)
        # Try the exact key first, then fall back to the lowercase key
        expanded_contraction = contractions_dict.get(matched) or contractions_dict.get(matched.lower())
        if expanded_contraction is None:
            return matched
        # Keep the casing of the first character of the original text
        return matched[0] + expanded_contraction[1:]

    return contractions_pattern.sub(replace, text)

# Apply the expansion to every free-text column
for column in ['review_text', 'plot_synopsis', 'plot_summary', 'review_summary']:
    movie_reviews[column] = movie_reviews[column].map(lambda x: expand_contractions(x, contractions))
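As a quick sanity check of the dictionary-plus-regex approach used in this step, the sketch below runs the same idea on a three-entry toy dictionary. `toy_contractions` and `expand_toy` are hypothetical names introduced only for illustration. The word-boundary anchors (`\b`) are an assumption of this sketch, added so that a key like "he's" does not fire inside a possessive such as "blanche's".

```python
import re

# Toy subset of a contractions dictionary, for illustration only.
toy_contractions = {"don't": "do not", "he's": "he is", "i've": "i have"}


def expand_toy(text, mapping=toy_contractions):
    """Expand contractions that appear as whole words (hypothetical helper)."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(key) for key in mapping) + r")\b",
        flags=re.IGNORECASE,
    )

    def replace(match):
        found = match.group(0)
        expanded = mapping[found.lower()]
        # Preserve the casing of the first character of the original text.
        return found[0] + expanded[1:]

    return pattern.sub(replace, text)


print(expand_toy("i've seen it and don't regret it"))
# i have seen it and do not regret it
print(expand_toy("Don't miss blanche's scene"))
# Do not miss blanche's scene
```

The second call shows the two behaviours worth checking before running on the full corpus: the capital "D" is preserved, and the possessive "blanche's" is left untouched because "he's" is not a whole word there.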
<re.Match object; span=(10, 13), match="i'm">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(14, 19), match="don't">
<re.Match object; span=(14, 17), match="i'm">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(4, 10), match="didn't">
<re.Match object; span=(7, 12), match="let's">
<re.Match object; span=(21, 26), match="don't">
<re.Match object; span=(18, 24), match="that's">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(57, 62), match="don't">
<re.Match object; span=(10, 15), match="we're">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(75, 82), match="doesn't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(37, 41), match="i've">
<re.Match object; span=(15, 20), match="can't">
<re.Match object; span=(19, 25), match="you'll">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(9, 14), match="can't">
<re.Match object; span=(3, 9), match="what's">
<re.Match object; span=(10, 16), match="you're">
<re.Match object; span=(15, 21), match="what's">
<re.Match object; span=(23, 30), match="doesn't">
<re.Match object; span=(8, 11), match="i'd">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(11, 14), match="i'm">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(13, 16), match="i'd">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(50, 56), match="that's">
<re.Match object; span=(11, 16), match="don't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(14, 18), match="i've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(22, 27), match="don't">
<re.Match object; span=(6, 13), match="doesn't">
<re.Match object; span=(0, 7), match="they'll">
<re.Match object; span=(15, 20), match="isn't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(35, 39), match="i've">
<re.Match object; span=(0, 5), match="who's">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(29, 33), match="i've">
<re.Match object; span=(37, 43), match="what's">
<re.Match object; span=(45, 50), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(26, 30), match="i've">
<re.Match object; span=(29, 33), match="i've">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(32, 36), match="i've">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(11, 16), match="don't">
<re.Match object; span=(25, 30), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(22, 28), match="that's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(24, 29), match="can't">
<re.Match object; span=(51, 54), match="i'm">
<re.Match object; span=(14, 21), match="doesn't">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(36, 43), match="there's">
<re.Match object; span=(24, 30), match="hasn't">
<re.Match object; span=(23, 28), match="can't">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(16, 23), match="doesn't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(51, 56), match="don't">
<re.Match object; span=(86, 93), match="there's">
<re.Match object; span=(122, 129), match="there's">
<re.Match object; span=(3, 9), match="didn't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(27, 32), match="isn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(4, 10), match="you're">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(3, 8), match="can't">
<re.Match object; span=(14, 20), match="you're">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(59, 66), match="doesn't">
<re.Match object; span=(49, 55), match="didn't">
<re.Match object; span=(10, 15), match="can't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(27, 32), match="don't">
<re.Match object; span=(10, 15), match="isn't">
<re.Match object; span=(10, 15), match="isn't">
<re.Match object; span=(77, 82), match="she's">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(50, 53), match="i'm">
<re.Match object; span=(61, 64), match="i'm">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(45, 50), match="can't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(24, 29), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(11, 16), match="don't">
<re.Match object; span=(25, 32), match="weren't">
<re.Match object; span=(5, 11), match="you're">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(11, 16), match="can't">
<re.Match object; span=(0, 5), match="we've">
<re.Match object; span=(24, 30), match="you'll">
<re.Match object; span=(38, 44), match="you'll">
<re.Match object; span=(50, 56), match="you'll">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(16, 22), match="you're">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(17, 22), match="who's">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(20, 25), match="can't">
<re.Match object; span=(55, 60), match="don't">
<re.Match object; span=(65, 71), match="what's">
<re.Match object; span=(16, 21), match="isn't">
<re.Match object; span=(14, 20), match="you're">
<re.Match object; span=(19, 25), match="that's">
<re.Match object; span=(11, 17), match="you'll">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(24, 31), match="doesn't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(86, 91), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(15, 21), match="you'll">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(7, 13), match="didn't">
<re.Match object; span=(19, 25), match="you're">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(63, 67), match="i've">
<re.Match object; span=(7, 13), match="didn't">
<re.Match object; span=(27, 32), match="can't">
<re.Match object; span=(10, 14), match="i've">
<re.Match object; span=(34, 39), match="we've">
<re.Match object; span=(27, 31), match="i've">
<re.Match object; span=(28, 35), match="haven't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(21, 25), match="i've">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(22, 27), match="don't">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(50, 55), match="don't">
<re.Match object; span=(153, 160), match="doesn't">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(12, 17), match="won't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(28, 34), match="you'll">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(13, 18), match="don't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(8, 14), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(37, 44), match="doesn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(30, 37), match="doesn't">
<re.Match object; span=(36, 43), match="they're">
<re.Match object; span=(12, 19), match="doesn't">
<re.Match object; span=(9, 14), match="don't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(12, 19), match="doesn't">
<re.Match object; span=(24, 30), match="what's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(32, 38), match="that's">
<re.Match object; span=(41, 46), match="you'd">
<re.Match object; span=(31, 35), match="i've">
<re.Match object; span=(26, 29), match="i'm">
<re.Match object; span=(22, 30), match="couldn't">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(27, 32), match="won't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(10, 17), match="doesn't">
<re.Match object; span=(15, 20), match="don't">
<re.Match object; span=(7, 12), match="can't">
<re.Match object; span=(64, 68), match="he's">
<re.Match object; span=(113, 119), match="didn't">
<re.Match object; span=(13, 18), match="isn't">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(18, 23), match="let's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(21, 25), match="i've">
<re.Match object; span=(36, 40), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(11, 17), match="you're">
<re.Match object; span=(16, 22), match="that's">
<re.Match object; span=(24, 31), match="haven't">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(5, 11), match="you're">
<re.Match object; span=(18, 24), match="you're">
<re.Match object; span=(13, 16), match="i'm">
<re.Match object; span=(10, 15), match="isn't">
<re.Match object; span=(13, 19), match="you've">
<re.Match object; span=(23, 28), match="you'd">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(18, 26), match="wouldn't">
<re.Match object; span=(47, 51), match="i've">
<re.Match object; span=(10, 16), match="hasn't">
<re.Match object; span=(11, 16), match="isn't">
<re.Match object; span=(2, 9), match="haven't">
<re.Match object; span=(23, 29), match="didn't">
<re.Match object; span=(25, 29), match="i've">
<re.Match object; span=(41, 45), match="i've">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(32, 37), match="can't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(18, 22), match="i'll">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(29, 36), match="doesn't">
<re.Match object; span=(22, 26), match="he's">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(4, 9), match="won't">
<re.Match object; span=(48, 54), match="you're">
<re.Match object; span=(16, 19), match="i'd">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(0, 7), match="they've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(22, 26), match="he's">
<re.Match object; span=(0, 6), match="you've">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(63, 69), match="you'll">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(21, 25), match="he's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(5, 11), match="what's">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(22, 26), match="he's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(18, 23), match="isn't">
<re.Match object; span=(34, 39), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(24, 29), match="don't">
<re.Match object; span=(28, 35), match="there's">
<re.Match object; span=(46, 52), match="you've">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(21, 25), match="he's">
<re.Match object; span=(15, 21), match="you'll">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(19, 24), match="don't">
<re.Match object; span=(29, 35), match="aren't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(26, 31), match="can't">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(38, 45), match="there's">
<re.Match object; span=(16, 21), match="let's">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(26, 31), match="don't">
<re.Match object; span=(36, 41), match="we're">
<re.Match object; span=(40, 44), match="he'd">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(4, 10), match="didn't">
<re.Match object; span=(0, 5), match="we're">
<re.Match object; span=(34, 39), match="we're">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(15, 19), match="i'll">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(13, 20), match="weren't">
<re.Match object; span=(35, 39), match="i've">
<re.Match object; span=(23, 29), match="that's">
<re.Match object; span=(37, 44), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(22, 29), match="haven't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(2, 10), match="couldn't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(65, 70), match="isn't">
<re.Match object; span=(33, 38), match="isn't">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(14, 21), match="doesn't">
<re.Match object; span=(50, 55), match="don't">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(17, 21), match="we'd">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(2, 10), match="couldn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(7, 14), match="haven't">
<re.Match object; span=(41, 48), match="haven't">
<re.Match object; span=(28, 31), match="i'd">
<re.Match object; span=(20, 24), match="i've">
<re.Match object; span=(12, 18), match="what's">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(26, 30), match="i've">
<re.Match object; span=(11, 18), match="doesn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(26, 31), match="don't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(37, 43), match="that's">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(19, 24), match="isn't">
<re.Match object; span=(20, 26), match="hasn't">
<re.Match object; span=(13, 19), match="you'll">
<re.Match object; span=(6, 9), match="i'm">
<re.Match object; span=(25, 32), match="haven't">
<re.Match object; span=(14, 21), match="haven't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(21, 26), match="don't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(10, 17), match="doesn't">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(13, 17), match="i'll">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(37, 43), match="you'll">
<re.Match object; span=(7, 14), match="haven't">
<re.Match object; span=(38, 43), match="don't">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(15, 21), match="didn't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(28, 34), match="that's">
<re.Match object; span=(27, 34), match="doesn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(26, 30), match="i've">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(32, 38), match="that's">
<re.Match object; span=(23, 29), match="didn't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(6, 11), match="don't">
<re.Match object; span=(13, 21), match="couldn't">
<re.Match object; span=(30, 34), match="i've">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(25, 30), match="can't">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(11, 17), match="you've">
<re.Match object; span=(12, 16), match="i've">
<re.Match object; span=(37, 42), match="can't">
<re.Match object; span=(60, 66), match="didn't">
<re.Match object; span=(6, 11), match="don't">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(33, 38), match="can't">
<re.Match object; span=(19, 24), match="don't">
<re.Match object; span=(30, 33), match="i'd">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(6, 13), match="doesn't">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(48, 53), match="can't">
<re.Match object; span=(30, 34), match="i've">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(4, 10), match="that's">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(18, 23), match="don't">
<re.Match object; span=(21, 26), match="don't">
<re.Match object; span=(51, 57), match="didn't">
<re.Match object; span=(46, 51), match="can't">
<re.Match object; span=(15, 20), match="don't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(17, 21), match="i'll">
<re.Match object; span=(28, 35), match="doesn't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(6, 10), match="i'll">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(36, 40), match="i've">
<re.Match object; span=(21, 29), match="couldn't">
<re.Match object; span=(12, 20), match="couldn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(21, 27), match="you'll">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(3, 9), match="you've">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(54, 58), match="i'll">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(22, 27), match="let's">
<re.Match object; span=(7, 13), match="you're">
<re.Match object; span=(45, 49), match="i've">
<re.Match object; span=(14, 19), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(13, 21), match="couldn't">
<re.Match object; span=(5, 9), match="i'll">
<re.Match object; span=(20, 25), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="you've">
<re.Match object; span=(34, 37), match="i'm">
<re.Match object; span=(54, 60), match="that's">
<re.Match object; span=(28, 33), match="you'd">
<re.Match object; span=(21, 25), match="i've">
<re.Match object; span=(5, 10), match="isn't">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(9, 15), match="you're">
<re.Match object; span=(35, 41), match="you'll">
<re.Match object; span=(3, 9), match="you've">
<re.Match object; span=(28, 34), match="you've">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(41, 47), match="hasn't">
<re.Match object; span=(13, 18), match="don't">
<re.Match object; span=(14, 19), match="you'd">
<re.Match object; span=(53, 58), match="you'd">
<re.Match object; span=(5, 12), match="doesn't">
<re.Match object; span=(13, 18), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(18, 24), match="that's">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(20, 26), match="what's">
<re.Match object; span=(37, 43), match="what's">
<re.Match object; span=(44, 49), match="don't">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(25, 28), match="i'm">
<re.Match object; span=(98, 103), match="isn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(8, 14), match="that's">
<re.Match object; span=(27, 33), match="that's">
<re.Match object; span=(5, 11), match="didn't">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(42, 48), match="that's">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(39, 43), match="i've">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(19, 23), match="i'll">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(23, 28), match="won't">
<re.Match object; span=(13, 17), match="i've">
<re.Match object; span=(8, 14), match="that's">
<re.Match object; span=(51, 57), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(32, 38), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(14, 20), match="that's">
<re.Match object; span=(54, 59), match="won't">
<re.Match object; span=(50, 55), match="don't">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(28, 34), match="that's">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(36, 40), match="i've">
<re.Match object; span=(18, 24), match="you've">
<re.Match object; span=(3, 11), match="wouldn't">
<re.Match object; span=(36, 40), match="i've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(24, 29), match="don't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(45, 50), match="don't">
<re.Match object; span=(15, 23), match="wouldn't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(8, 17), match="shouldn't">
<re.Match object; span=(23, 29), match="you'll">
<re.Match object; span=(50, 56), match="didn't">
<re.Match object; span=(83, 88), match="won't">
<re.Match object; span=(17, 22), match="don't">
<re.Match object; span=(12, 17), match="can't">
<re.Match object; span=(0, 7), match="they've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(13, 17), match="i've">
<re.Match object; span=(26, 33), match="doesn't">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(18, 21), match="i'm">
<re.Match object; span=(27, 31), match="i've">
<re.Match object; span=(30, 34), match="i've">
<re.Match object; span=(39, 43), match="i'll">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(49, 55), match="you'll">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(51, 55), match="i've">
<re.Match object; span=(64, 70), match="you'll">
<re.Match object; span=(35, 40), match="don't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(69, 73), match="i've">
<re.Match object; span=(30, 36), match="you're">
<re.Match object; span=(61, 67), match="you're">
<re.Match object; span=(24, 31), match="weren't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(11, 18), match="where's">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(50, 56), match="didn't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(32, 38), match="you'll">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(7, 13), match="didn't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(13, 19), match="aren't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(48, 53), match="don't">
<re.Match object; span=(72, 77), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(30, 36), match="that's">
<re.Match object; span=(5, 12), match="doesn't">
<re.Match object; span=(12, 17), match="don't">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(25, 29), match="he'd">
<re.Match object; span=(52, 58), match="that's">
<re.Match object; span=(10, 15), match="isn't">
<re.Match object; span=(47, 50), match="i'd">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(11, 18), match="doesn't">
<re.Match object; span=(28, 33), match="can't">
<re.Match object; span=(3, 8), match="isn't">
<re.Match object; span=(34, 38), match="he's">
<re.Match object; span=(55, 60), match="won't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(7, 14), match="doesn't">
<re.Match object; span=(0, 6), match="that's">
<re.Match object; span=(8, 13), match="don't">
<re.Match object; span=(38, 44), match="you'll">
<re.Match object; span=(15, 20), match="don't">
<re.Match object; span=(48, 52), match="i've">
<re.Match object; span=(67, 70), match="i'm">
<re.Match object; span=(0, 6), match="that's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(14, 20), match="that's">
<re.Match object; span=(33, 37), match="he'd">
<re.Match object; span=(28, 35), match="haven't">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(10, 16), match="that's">
<re.Match object; span=(27, 34), match="doesn't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(0, 5), match="won't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 7), match="haven't">
<re.Match object; span=(22, 27), match="don't">
<re.Match object; span=(7, 10), match="i'm">
<re.Match object; span=(30, 35), match="isn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(6, 11), match="don't">
<re.Match object; span=(11, 16), match="can't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(33, 37), match="i've">
<re.Match object; span=(11, 16), match="we've">
<re.Match object; span=(36, 44), match="couldn't">
<re.Match object; span=(1, 5), match="i'll">
<re.Match object; span=(26, 31), match="don't">
<re.Match object; span=(13, 18), match="can't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(31, 36), match="can't">
<re.Match object; span=(17, 24), match="doesn't">
<re.Match object; span=(8, 12), match="i'll">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(44, 49), match="isn't">
<re.Match object; span=(27, 31), match="i've">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(33, 42), match="shouldn't">
<re.Match object; span=(12, 17), match="can't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(40, 45), match="can't">
<re.Match object; span=(20, 25), match="can't">
<re.Match object; span=(13, 18), match="can't">
<re.Match object; span=(22, 30), match="couldn't">
<re.Match object; span=(13, 18), match="can't">
<re.Match object; span=(28, 36), match="couldn't">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(13, 18), match="can't">
<re.Match object; span=(22, 27), match="isn't">
<re.Match object; span=(48, 54), match="that's">
<re.Match object; span=(66, 71), match="won't">
<re.Match object; span=(3, 8), match="can't">
<re.Match object; span=(28, 35), match="doesn't">
<re.Match object; span=(12, 17), match="don't">
<re.Match object; span=(13, 18), match="can't">
<re.Match object; span=(12, 17), match="can't">
<re.Match object; span=(39, 43), match="i've">
<re.Match object; span=(22, 27), match="can't">
<re.Match object; span=(13, 18), match="can't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(13, 18), match="can't">
<re.Match object; span=(13, 18), match="can't">
<re.Match object; span=(13, 18), match="can't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(43, 49), match="you'll">
<re.Match object; span=(17, 23), match="hasn't">
<re.Match object; span=(48, 53), match="can't">
<re.Match object; span=(35, 41), match="what's">
<re.Match object; span=(28, 33), match="can't">
<re.Match object; span=(10, 14), match="i'll">
<re.Match object; span=(37, 42), match="can't">
<re.Match object; span=(37, 42), match="can't">
<re.Match object; span=(0, 5), match="who's">
<re.Match object; span=(15, 18), match="i'm">
<re.Match object; span=(42, 46), match="i'll">
<re.Match object; span=(50, 57), match="doesn't">
<re.Match object; span=(12, 17), match="can't">
<re.Match object; span=(35, 41), match="you're">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(31, 38), match="there's">
<re.Match object; span=(12, 17), match="can't">
<re.Match object; span=(28, 33), match="can't">
<re.Match object; span=(12, 17), match="can't">
<re.Match object; span=(12, 17), match="can't">
<re.Match object; span=(69, 75), match="that's">
<re.Match object; span=(3, 9), match="didn't">
<re.Match object; span=(43, 47), match="i've">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(3, 9), match="didn't">
<re.Match object; span=(12, 18), match="hasn't">
<re.Match object; span=(3, 8), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(22, 28), match="you're">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(17, 23), match="you're">
<re.Match object; span=(55, 60), match="don't">
<re.Match object; span=(14, 20), match="didn't">
<re.Match object; span=(4, 10), match="that's">
<re.Match object; span=(21, 25), match="i've">
<re.Match object; span=(21, 25), match="i'll">
<re.Match object; span=(7, 14), match="doesn't">
<re.Match object; span=(24, 30), match="that's">
<re.Match object; span=(29, 34), match="isn't">
<re.Match object; span=(42, 47), match="isn't">
<re.Match object; span=(18, 21), match="i'm">
<re.Match object; span=(34, 39), match="can't">
<re.Match object; span=(12, 17), match="you'd">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(27, 32), match="don't">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(22, 28), match="you'll">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(39, 46), match="doesn't">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(16, 21), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(25, 32), match="weren't">
<re.Match object; span=(17, 23), match="you're">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(7, 14), match="weren't">
<re.Match object; span=(47, 53), match="you'll">
<re.Match object; span=(7, 14), match="haven't">
<re.Match object; span=(31, 37), match="you'll">
<re.Match object; span=(25, 31), match="you'll">
<re.Match object; span=(21, 27), match="you're">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(21, 26), match="don't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(6, 13), match="doesn't">
<re.Match object; span=(117, 124), match="doesn't">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(50, 57), match="haven't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(94, 100), match="that's">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(19, 25), match="didn't">
<re.Match object; span=(5, 11), match="didn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(20, 26), match="you're">
<re.Match object; span=(0, 6), match="that's">
<re.Match object; span=(10, 15), match="can't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(16, 19), match="i'm">
<re.Match object; span=(27, 31), match="i've">
<re.Match object; span=(50, 54), match="i'll">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(21, 24), match="i'm">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(10, 15), match="don't">
<re.Match object; span=(25, 29), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(22, 27), match="can't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(13, 21), match="couldn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(3, 9), match="you've">
<re.Match object; span=(23, 28), match="don't">
<re.Match object; span=(10, 14), match="i've">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(9, 14), match="don't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(11, 16), match="can't">
<re.Match object; span=(22, 28), match="you'll">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(14, 19), match="don't">
<re.Match object; span=(37, 44), match="doesn't">
<re.Match object; span=(12, 17), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(28, 31), match="i'm">
<re.Match object; span=(46, 51), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(16, 21), match="don't">
<re.Match object; span=(59, 63), match="i'll">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(48, 53), match="don't">
<re.Match object; span=(71, 77), match="that's">
<re.Match object; span=(12, 17), match="don't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(16, 21), match="we're">
<re.Match object; span=(10, 15), match="don't">
<re.Match object; span=(3, 9), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(10, 15), match="can't">
<re.Match object; span=(42, 48), match="you'll">
<re.Match object; span=(38, 44), match="that's">
<re.Match object; span=(0, 5), match="who's">
<re.Match object; span=(9, 15), match="that's">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(26, 30), match="i've">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(36, 41), match="don't">
<re.Match object; span=(0, 5), match="she's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(35, 42), match="doesn't">
<re.Match object; span=(17, 22), match="isn't">
<re.Match object; span=(5, 13), match="wouldn't">
<re.Match object; span=(21, 27), match="didn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(25, 32), match="haven't">
<re.Match object; span=(24, 29), match="can't">
<re.Match object; span=(32, 38), match="didn't">
<re.Match object; span=(50, 58), match="couldn't">
<re.Match object; span=(15, 19), match="we'd">
<re.Match object; span=(49, 55), match="aren't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(35, 39), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(16, 21), match="won't">
<re.Match object; span=(39, 45), match="that's">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(25, 28), match="i'd">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(131, 138), match="they've">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(28, 34), match="that's">
<re.Match object; span=(11, 16), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(57, 62), match="don't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(13, 18), match="don't">
<re.Match object; span=(17, 22), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(4, 10), match="what's">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(53, 62), match="shouldn't">
<re.Match object; span=(4, 9), match="isn't">
<re.Match object; span=(28, 33), match="won't">
<re.Match object; span=(37, 42), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="who's">
<re.Match object; span=(19, 23), match="i'll">
<re.Match object; span=(46, 51), match="won't">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(30, 35), match="isn't">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(12, 16), match="i've">
<re.Match object; span=(17, 22), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(18, 22), match="i've">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(36, 42), match="you'll">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(28, 35), match="they're">
<re.Match object; span=(22, 27), match="don't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(10, 14), match="i'll">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(0, 5), match="we're">
<re.Match object; span=(16, 22), match="aren't">
<re.Match object; span=(22, 27), match="can't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(7, 14), match="doesn't">
<re.Match object; span=(16, 21), match="don't">
<re.Match object; span=(16, 22), match="you'll">
<re.Match object; span=(14, 20), match="aren't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(7, 12), match="isn't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(14, 18), match="i've">
<re.Match object; span=(11, 16), match="don't">
<re.Match object; span=(55, 60), match="don't">
<re.Match object; span=(30, 34), match="i've">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(69, 72), match="i'm">
<re.Match object; span=(19, 27), match="couldn't">
<re.Match object; span=(4, 10), match="that's">
<re.Match object; span=(64, 70), match="didn't">
<re.Match object; span=(126, 130), match="he's">
<re.Match object; span=(26, 32), match="that's">
<re.Match object; span=(29, 36), match="doesn't">
<re.Match object; span=(34, 40), match="didn't">
<re.Match object; span=(7, 13), match="you're">
<re.Match object; span=(47, 52), match="don't">
<re.Match object; span=(0, 6), match="you've">
<re.Match object; span=(12, 16), match="i've">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(14, 19), match="don't">
<re.Match object; span=(9, 14), match="don't">
<re.Match object; span=(0, 5), match="you'd">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(34, 40), match="didn't">
<re.Match object; span=(38, 45), match="doesn't">
<re.Match object; span=(20, 24), match="i've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(13, 18), match="can't">
<re.Match object; span=(26, 32), match="that's">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(42, 47), match="don't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(6, 9), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(22, 28), match="didn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(4, 8), match="he'd">
<re.Match object; span=(6, 11), match="can't">
<re.Match object; span=(0, 7), match="haven't">
<re.Match object; span=(21, 26), match="don't">
<re.Match object; span=(9, 15), match="that's">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(13, 18), match="isn't">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(37, 42), match="isn't">
<re.Match object; span=(47, 53), match="that's">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(40, 46), match="that's">
<re.Match object; span=(10, 15), match="don't">
<re.Match object; span=(49, 53), match="i've">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(15, 21), match="didn't">
<re.Match object; span=(51, 55), match="i've">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(28, 36), match="wouldn't">
<re.Match object; span=(17, 23), match="that's">
<re.Match object; span=(29, 35), match="that's">
<re.Match object; span=(9, 15), match="aren't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(22, 28), match="you'll">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(7, 14), match="haven't">
<re.Match object; span=(7, 14), match="haven't">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(24, 28), match="i'll">
<re.Match object; span=(4, 10), match="hasn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(12, 18), match="didn't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(43, 49), match="you've">
<re.Match object; span=(42, 47), match="don't">
<re.Match object; span=(61, 67), match="didn't">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(4, 11), match="doesn't">
<re.Match object; span=(24, 27), match="i'd">
<re.Match object; span=(47, 52), match="who's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(29, 35), match="you're">
<re.Match object; span=(14, 17), match="i'm">
<re.Match object; span=(37, 43), match="you're">
<re.Match object; span=(14, 19), match="isn't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(3, 10), match="there's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(13, 19), match="that's">
<re.Match object; span=(8, 13), match="can't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(43, 48), match="won't">
<re.Match object; span=(20, 24), match="i've">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(32, 37), match="who's">
<re.Match object; span=(51, 56), match="don't">
<re.Match object; span=(21, 26), match="can't">
<re.Match object; span=(94, 99), match="won't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(58, 64), match="didn't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(32, 38), match="you're">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(3, 9), match="didn't">
<re.Match object; span=(2, 9), match="haven't">
<re.Match object; span=(17, 24), match="there's">
<re.Match object; span=(49, 56), match="there's">
<re.Match object; span=(60, 69), match="shouldn't">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(13, 18), match="isn't">
<re.Match object; span=(42, 49), match="doesn't">
<re.Match object; span=(18, 23), match="can't">
<re.Match object; span=(55, 59), match="i've">
<re.Match object; span=(51, 56), match="you'd">
<re.Match object; span=(3, 8), match="don't">
<re.Match object; span=(24, 29), match="let's">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(37, 43), match="that's">
<re.Match object; span=(6, 11), match="don't">
<re.Match object; span=(39, 45), match="didn't">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(103, 110), match="there's">
<re.Match object; span=(122, 125), match="i'm">
<re.Match object; span=(21, 24), match="i'm">
<re.Match object; span=(6, 9), match="i'm">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(113, 119), match="that's">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(28, 33), match="can't">
<re.Match object; span=(18, 21), match="i'm">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(31, 38), match="there's">
<re.Match object; span=(26, 31), match="who's">
<re.Match object; span=(5, 11), match="hasn't">
<re.Match object; span=(21, 25), match="he's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(31, 35), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="that's">
<re.Match object; span=(22, 27), match="don't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(6, 10), match="i've">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(10, 14), match="i've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(47, 53), match="that's">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(28, 32), match="i've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(9, 14), match="can't">
<re.Match object; span=(4, 10), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="aren't">
<re.Match object; span=(7, 13), match="didn't">
<re.Match object; span=(25, 30), match="don't">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(35, 39), match="i've">
<re.Match object; span=(8, 13), match="don't">
<re.Match object; span=(8, 13), match="isn't">
<re.Match object; span=(14, 19), match="don't">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(19, 24), match="don't">
<re.Match object; span=(25, 29), match="i've">
<re.Match object; span=(20, 26), match="you've">
<re.Match object; span=(36, 40), match="i've">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(19, 22), match="i'm">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(32, 36), match="i've">
<re.Match object; span=(30, 35), match="can't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(15, 21), match="aren't">
<re.Match object; span=(14, 19), match="don't">
<re.Match object; span=(11, 16), match="don't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(28, 35), match="doesn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(12, 17), match="can't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(43, 48), match="don't">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(47, 51), match="i've">
<re.Match object; span=(9, 14), match="who's">
<re.Match object; span=(24, 30), match="that's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(134, 139), match="he'll">
<re.Match object; span=(437, 442), match="she's">
<re.Match object; span=(18, 27), match="shouldn't">
<re.Match object; span=(35, 40), match="don't">
<re.Match object; span=(5, 11), match="you're">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(33, 37), match="i've">
<re.Match object; span=(43, 47), match="i've">
<re.Match object; span=(28, 32), match="i've">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(4, 10), match="didn't">
<re.Match object; span=(13, 18), match="don't">
<re.Match object; span=(26, 31), match="don't">
<re.Match object; span=(15, 20), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(12, 16), match="i've">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(24, 30), match="didn't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(30, 37), match="there's">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(40, 45), match="don't">
<re.Match object; span=(39, 46), match="doesn't">
<re.Match object; span=(26, 31), match="don't">
<re.Match object; span=(16, 21), match="isn't">
<re.Match object; span=(23, 28), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(37, 42), match="don't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(20, 27), match="doesn't">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(30, 35), match="isn't">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(13, 20), match="doesn't">
<re.Match object; span=(42, 46), match="i've">
<re.Match object; span=(9, 14), match="can't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(31, 36), match="can't">
<re.Match object; span=(33, 38), match="can't">
<re.Match object; span=(28, 35), match="they'll">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(34, 38), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 10), match="wouldn't">
<re.Match object; span=(91, 98), match="there's">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(40, 44), match="i've">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(36, 40), match="i've">
<re.Match object; span=(9, 15), match="didn't">
<re.Match object; span=(27, 30), match="i'd">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(23, 28), match="can't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(32, 36), match="i've">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(0, 6), match="you've">
<re.Match object; span=(20, 25), match="we've">
<re.Match object; span=(3, 11), match="couldn't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(37, 41), match="i've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(27, 32), match="don't">
<re.Match object; span=(9, 15), match="that's">
<re.Match object; span=(25, 29), match="he's">
<re.Match object; span=(45, 52), match="doesn't">
<re.Match object; span=(25, 29), match="i've">
<re.Match object; span=(17, 23), match="you'll">
<re.Match object; span=(15, 21), match="you've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(11, 16), match="don't">
<re.Match object; span=(26, 33), match="they're">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(23, 26), match="i'm">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(43, 49), match="aren't">
<re.Match object; span=(11, 17), match="didn't">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(142, 146), match="i've">
<re.Match object; span=(39, 42), match="i'd">
<re.Match object; span=(10, 16), match="you've">
<re.Match object; span=(11, 14), match="i'd">
<re.Match object; span=(57, 63), match="you'll">
<re.Match object; span=(44, 50), match="that's">
<re.Match object; span=(9, 15), match="you're">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(9, 12), match="i'd">
<re.Match object; span=(25, 29), match="i've">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(8, 15), match="they're">
<re.Match object; span=(20, 25), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(23, 28), match="don't">
<re.Match object; span=(0, 6), match="you've">
<re.Match object; span=(17, 24), match="they're">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(21, 26), match="isn't">
<re.Match object; span=(38, 43), match="don't">
<re.Match object; span=(87, 93), match="you'll">
<re.Match object; span=(122, 128), match="you'll">
<re.Match object; span=(77, 81), match="i've">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(43, 47), match="i've">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(11, 18), match="they're">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(7, 13), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(9, 16), match="doesn't">
<re.Match object; span=(48, 53), match="don't">
<re.Match object; span=(10, 14), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(5, 10), match="isn't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(42, 46), match="i've">
<re.Match object; span=(10, 14), match="i've">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(36, 42), match="you've">
<re.Match object; span=(23, 29), match="you'll">
<re.Match object; span=(10, 15), match="don't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(2, 10), match="wouldn't">
<re.Match object; span=(12, 19), match="doesn't">
<re.Match object; span=(7, 10), match="i'm">
<re.Match object; span=(18, 23), match="can't">
<re.Match object; span=(27, 31), match="i've">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(19, 26), match="doesn't">
<re.Match object; span=(41, 48), match="they're">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(28, 34), match="you're">
<re.Match object; span=(48, 52), match="i've">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(7, 14), match="doesn't">
<re.Match object; span=(32, 37), match="won't">
<re.Match object; span=(0, 6), match="that's">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(71, 76), match="can't">
<re.Match object; span=(33, 38), match="don't">
<re.Match object; span=(28, 35), match="doesn't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(19, 22), match="i'm">
<re.Match object; span=(14, 20), match="that's">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(62, 65), match="i'm">
<re.Match object; span=(69, 72), match="i'm">
<re.Match object; span=(5, 11), match="didn't">
<re.Match object; span=(13, 18), match="can't">
<re.Match object; span=(9, 14), match="you'd">
<re.Match object; span=(13, 19), match="that's">
<re.Match object; span=(45, 51), match="didn't">
<re.Match object; span=(39, 48), match="shouldn't">
<re.Match object; span=(4, 10), match="you're">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(20, 28), match="wouldn't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(40, 46), match="that's">
<re.Match object; span=(24, 30), match="that's">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(38, 45), match="doesn't">
<re.Match object; span=(14, 21), match="weren't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(13, 18), match="don't">
<re.Match object; span=(36, 41), match="don't">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(55, 61), match="you're">
<re.Match object; span=(32, 36), match="i've">
<re.Match object; span=(16, 22), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(15, 21), match="you're">
<re.Match object; span=(50, 58), match="couldn't">
<re.Match object; span=(18, 23), match="don't">
<re.Match object; span=(62, 69), match="they're">
<re.Match object; span=(12, 18), match="didn't">
<re.Match object; span=(45, 53), match="wouldn't">
<re.Match object; span=(11, 20), match="shouldn't">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(39, 42), match="i'm">
<re.Match object; span=(36, 41), match="can't">
<re.Match object; span=(23, 28), match="don't">
<re.Match object; span=(60, 66), match="you'll">
<re.Match object; span=(91, 97), match="you're">
<re.Match object; span=(66, 73), match="doesn't">
<re.Match object; span=(6, 13), match="where's">
<re.Match object; span=(23, 26), match="i'd">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(14, 20), match="you'll">
<re.Match object; span=(32, 38), match="that's">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(16, 21), match="isn't">
<re.Match object; span=(122, 131), match="shouldn't">
<re.Match object; span=(58, 63), match="you'd">
<re.Match object; span=(2, 10), match="couldn't">
<re.Match object; span=(26, 32), match="you'll">
<re.Match object; span=(11, 16), match="you'd">
<re.Match object; span=(50, 56), match="that's">
<re.Match object; span=(27, 30), match="i'm">
<re.Match object; span=(27, 30), match="i'm">
<re.Match object; span=(20, 25), match="don't">
<re.Match object; span=(49, 54), match="don't">
<re.Match object; span=(29, 34), match="don't">
<re.Match object; span=(2, 10), match="couldn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(6, 11), match="isn't">
<re.Match object; span=(59, 64), match="don't">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(21, 27), match="you'll">
<re.Match object; span=(0, 7), match="they're">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(32, 37), match="can't">
<re.Match object; span=(9, 14), match="can't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(22, 28), match="you'll">
<re.Match object; span=(3, 8), match="won't">
<re.Match object; span=(70, 76), match="what's">
<re.Match object; span=(19, 24), match="can't">
<re.Match object; span=(34, 39), match="don't">
<re.Match object; span=(49, 52), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(16, 22), match="that's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(68, 72), match="i've">
<re.Match object; span=(8, 14), match="what's">
<re.Match object; span=(20, 26), match="you're">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(12, 17), match="don't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(29, 35), match="you'll">
<re.Match object; span=(13, 19), match="hasn't">
<re.Match object; span=(3, 8), match="don't">
<re.Match object; span=(9, 14), match="can't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(60, 65), match="can't">
<re.Match object; span=(20, 24), match="i've">
<re.Match object; span=(33, 38), match="can't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(41, 46), match="isn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="that's">
<re.Match object; span=(31, 36), match="who's">
<re.Match object; span=(30, 37), match="doesn't">
<re.Match object; span=(29, 36), match="doesn't">
<re.Match object; span=(22, 28), match="didn't">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(13, 18), match="can't">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(25, 32), match="there's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(25, 32), match="there's">
<re.Match object; span=(33, 37), match="i've">
<re.Match object; span=(0, 7), match="they've">
<re.Match object; span=(0, 9), match="shouldn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(9, 14), match="you'd">
<re.Match object; span=(27, 33), match="that's">
<re.Match object; span=(42, 49), match="there's">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(46, 52), match="you're">
<re.Match object; span=(59, 65), match="who'll">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(35, 40), match="don't">
<re.Match object; span=(51, 54), match="i'm">
<re.Match object; span=(0, 5), match="who's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(27, 33), match="you're">
<re.Match object; span=(29, 32), match="i'm">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(23, 28), match="isn't">
<re.Match object; span=(38, 43), match="isn't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(24, 29), match="don't">
<re.Match object; span=(20, 27), match="there's">
<re.Match object; span=(0, 5), match="who's">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(11, 17), match="aren't">
<re.Match object; span=(1, 6), match="won't">
<re.Match object; span=(8, 16), match="couldn't">
<re.Match object; span=(3, 8), match="can't">
<re.Match object; span=(17, 23), match="that's">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(4, 9), match="won't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(34, 38), match="i've">
<re.Match object; span=(29, 36), match="they're">
<re.Match object; span=(45, 52), match="doesn't">
<re.Match object; span=(41, 47), match="you've">
<re.Match object; span=(14, 21), match="haven't">
<re.Match object; span=(23, 30), match="there's">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(36, 40), match="i've">
<re.Match object; span=(50, 58), match="wouldn't">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(9, 15), match="aren't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(14, 19), match="don't">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(18, 22), match="i've">
<re.Match object; span=(19, 24), match="can't">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(70, 76), match="you're">
<re.Match object; span=(37, 40), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(18, 22), match="i've">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(29, 36), match="doesn't">
<re.Match object; span=(38, 43), match="don't">
<re.Match object; span=(17, 23), match="that's">
<re.Match object; span=(51, 55), match="i've">
<re.Match object; span=(8, 13), match="isn't">
<re.Match object; span=(29, 34), match="don't">
<re.Match object; span=(5, 13), match="couldn't">
<re.Match object; span=(20, 28), match="couldn't">
<re.Match object; span=(38, 43), match="don't">
<re.Match object; span=(19, 24), match="don't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(28, 35), match="there's">
<re.Match object; span=(22, 27), match="can't">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(12, 18), match="you've">
<re.Match object; span=(87, 93), match="aren't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(8, 13), match="don't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(27, 33), match="that's">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(20, 25), match="isn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(25, 31), match="you've">
<re.Match object; span=(33, 39), match="didn't">
<re.Match object; span=(13, 16), match="i'd">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(7, 14), match="haven't">
<re.Match object; span=(36, 41), match="don't">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(16, 22), match="didn't">
<re.Match object; span=(20, 27), match="doesn't">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(26, 31), match="can't">
<re.Match object; span=(28, 35), match="where's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(8, 14), match="didn't">
<re.Match object; span=(27, 31), match="i've">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(9, 13), match="i've">
<re.Match object; span=(20, 24), match="i've">
<re.Match object; span=(27, 31), match="he's">
<re.Match object; span=(9, 14), match="can't">
<re.Match object; span=(15, 21), match="what's">
<re.Match object; span=(21, 27), match="you'll">
<re.Match object; span=(50, 56), match="you'll">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(10, 16), match="didn't">
<re.Match object; span=(46, 51), match="don't">
<re.Match object; span=(29, 35), match="you'll">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(17, 22), match="don't">
<re.Match object; span=(17, 23), match="you're">
<re.Match object; span=(50, 53), match="i'm">
<re.Match object; span=(10, 18), match="wouldn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(51, 55), match="i've">
<re.Match object; span=(24, 29), match="isn't">
<re.Match object; span=(33, 39), match="that's">
<re.Match object; span=(6, 13), match="there's">
<re.Match object; span=(34, 41), match="doesn't">
<re.Match object; span=(5, 13), match="couldn't">
<re.Match object; span=(70, 76), match="you'll">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(11, 17), match="didn't">
<re.Match object; span=(26, 30), match="i've">
<re.Match object; span=(56, 61), match="isn't">
<re.Match object; span=(8, 13), match="don't">
<re.Match object; span=(40, 45), match="can't">
<re.Match object; span=(50, 54), match="i've">
<re.Match object; span=(35, 39), match="he's">
<re.Match object; span=(13, 20), match="doesn't">
<re.Match object; span=(21, 27), match="you've">
<re.Match object; span=(26, 30), match="i've">
<re.Match object; span=(6, 11), match="can't">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(17, 22), match="don't">
<re.Match object; span=(52, 57), match="don't">
<re.Match object; span=(5, 8), match="i'd">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(6, 10), match="i've">
<re.Match object; span=(2, 10), match="couldn't">
<re.Match object; span=(32, 37), match="isn't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(45, 50), match="isn't">
<re.Match object; span=(61, 67), match="that's">
<re.Match object; span=(21, 26), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 7), match="they'll">
<re.Match object; span=(36, 42), match="that's">
<re.Match object; span=(42, 46), match="i've">
<re.Match object; span=(29, 34), match="isn't">
<re.Match object; span=(17, 23), match="that's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(28, 34), match="didn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(39, 44), match="don't">
<re.Match object; span=(7, 12), match="can't">
<re.Match object; span=(29, 35), match="that's">
<re.Match object; span=(40, 46), match="that's">
<re.Match object; span=(29, 34), match="don't">
<re.Match object; span=(74, 80), match="that's">
<re.Match object; span=(10, 14), match="i've">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(6, 11), match="who's">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(12, 17), match="won't">
<re.Match object; span=(31, 35), match="i've">
<re.Match object; span=(17, 22), match="can't">
<re.Match object; span=(52, 58), match="that's">
<re.Match object; span=(30, 34), match="i'll">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(3, 9), match="that's">
<re.Match object; span=(0, 6), match="you've">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(67, 74), match="there's">
<re.Match object; span=(21, 27), match="that's">
<re.Match object; span=(23, 29), match="you'll">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(4, 10), match="that's">
<re.Match object; span=(11, 16), match="don't">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(30, 35), match="don't">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(42, 48), match="you'll">
<re.Match object; span=(38, 43), match="don't">
<re.Match object; span=(7, 13), match="you'll">
<re.Match object; span=(59, 65), match="you'll">
<re.Match object; span=(28, 35), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(38, 43), match="don't">
<re.Match object; span=(63, 68), match="can't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(40, 47), match="doesn't">
<re.Match object; span=(9, 15), match="didn't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(28, 35), match="doesn't">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(29, 34), match="we're">
<re.Match object; span=(7, 10), match="i'm">
<re.Match object; span=(23, 30), match="doesn't">
<re.Match object; span=(24, 30), match="what's">
<re.Match object; span=(12, 17), match="don't">
<re.Match object; span=(13, 19), match="you've">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(17, 23), match="you've">
<re.Match object; span=(48, 51), match="i'm">
<re.Match object; span=(28, 34), match="aren't">
<re.Match object; span=(38, 42), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(28, 33), match="can't">
<re.Match object; span=(34, 40), match="you're">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(39, 46), match="doesn't">
<re.Match object; span=(5, 10), match="isn't">
<re.Match object; span=(32, 39), match="haven't">
<re.Match object; span=(43, 52), match="shouldn't">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(29, 36), match="doesn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(23, 29), match="that's">
<re.Match object; span=(3, 9), match="you've">
<re.Match object; span=(11, 15), match="i'll">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(7, 14), match="haven't">
<re.Match object; span=(32, 37), match="don't">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(13, 18), match="you'd">
<re.Match object; span=(39, 43), match="i've">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(9, 17), match="couldn't">
<re.Match object; span=(10, 16), match="you're">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(19, 24), match="won't">
<re.Match object; span=(0, 6), match="you've">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(36, 41), match="don't">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(43, 51), match="wouldn't">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(7, 13), match="what's">
<re.Match object; span=(28, 33), match="don't">
<re.Match object; span=(24, 29), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(22, 26), match="i'll">
<re.Match object; span=(11, 20), match="shouldn't">
<re.Match object; span=(49, 55), match="that's">
<re.Match object; span=(30, 38), match="couldn't">
<re.Match object; span=(13, 18), match="can't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(48, 53), match="won't">
<re.Match object; span=(23, 28), match="you'd">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(86, 94), match="wouldn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(18, 24), match="you're">
<re.Match object; span=(28, 33), match="don't">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(9, 13), match="i've">
<re.Match object; span=(26, 32), match="you're">
<re.Match object; span=(0, 5), match="isn't">
<re.Match object; span=(39, 43), match="i've">
<re.Match object; span=(0, 7), match="they're">
<re.Match object; span=(52, 58), match="you're">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(8, 13), match="don't">
<re.Match object; span=(0, 7), match="they're">
<re.Match object; span=(17, 24), match="they're">
<re.Match object; span=(43, 50), match="they're">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(11, 17), match="you're">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(2, 7), match="won't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(50, 55), match="isn't">
<re.Match object; span=(23, 31), match="couldn't">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(9, 12), match="i'm">
<re.Match object; span=(70, 75), match="isn't">
<re.Match object; span=(35, 39), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(6, 13), match="doesn't">
<re.Match object; span=(7, 13), match="didn't">
<re.Match object; span=(23, 28), match="can't">
<re.Match object; span=(28, 34), match="hadn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(14, 19), match="isn't">
<re.Match object; span=(31, 37), match="didn't">
<re.Match object; span=(30, 34), match="i've">
<re.Match object; span=(0, 5), match="who's">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(13, 18), match="don't">
<re.Match object; span=(11, 16), match="isn't">
<re.Match object; span=(8, 15), match="doesn't">
<re.Match object; span=(14, 19), match="don't">
<re.Match object; span=(39, 44), match="can't">
<re.Match object; span=(11, 16), match="isn't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(37, 43), match="didn't">
<re.Match object; span=(10, 17), match="doesn't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(12, 16), match="i'll">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(25, 30), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(18, 23), match="don't">
<re.Match object; span=(44, 51), match="doesn't">
<re.Match object; span=(61, 66), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(40, 44), match="i've">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(33, 39), match="aren't">
<re.Match object; span=(18, 24), match="you're">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(2, 10), match="couldn't">
<re.Match object; span=(33, 40), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(9, 14), match="won't">
<re.Match object; span=(29, 33), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(30, 36), match="you've">
<re.Match object; span=(15, 21), match="didn't">
<re.Match object; span=(73, 78), match="can't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(9, 15), match="that's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(42, 47), match="can't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(24, 29), match="she's">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(8, 14), match="you'll">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(8, 13), match="she's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(11, 17), match="that's">
<re.Match object; span=(3, 8), match="can't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(141, 146), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(34, 40), match="you're">
<re.Match object; span=(20, 24), match="i've">
<re.Match object; span=(0, 5), match="we're">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(18, 22), match="i've">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(38, 43), match="isn't">
<re.Match object; span=(68, 73), match="don't">
<re.Match object; span=(20, 23), match="i'm">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(23, 29), match="that's">
<re.Match object; span=(19, 25), match="what's">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(22, 28), match="didn't">
<re.Match object; span=(29, 33), match="i've">
<re.Match object; span=(5, 9), match="i'll">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(13, 16), match="i'm">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(25, 31), match="you've">
<re.Match object; span=(11, 16), match="isn't">
<re.Match object; span=(16, 22), match="you're">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(24, 31), match="doesn't">
<re.Match object; span=(0, 5), match="we're">
<re.Match object; span=(74, 79), match="who's">
<re.Match object; span=(0, 6), match="that's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(34, 39), match="don't">
<re.Match object; span=(3, 8), match="can't">
<re.Match object; span=(9, 17), match="couldn't">
<re.Match object; span=(24, 30), match="you're">
<re.Match object; span=(34, 38), match="i'll">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(2, 10), match="wouldn't">
<re.Match object; span=(6, 9), match="i'm">
<re.Match object; span=(15, 22), match="doesn't">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(12, 18), match="that's">
<re.Match object; span=(53, 56), match="i'm">
<re.Match object; span=(26, 31), match="don't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(27, 32), match="won't">
<re.Match object; span=(3, 8), match="isn't">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(64, 70), match="that's">
<re.Match object; span=(105, 110), match="won't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(12, 15), match="i'm">
<re.Match object; span=(29, 36), match="doesn't">
<re.Match object; span=(35, 41), match="aren't">
<re.Match object; span=(99, 104), match="don't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(2, 10), match="wouldn't">
<re.Match object; span=(45, 50), match="don't">
<re.Match object; span=(57, 62), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(23, 29), match="you've">
<re.Match object; span=(49, 55), match="you'll">
<re.Match object; span=(8, 13), match="don't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(58, 65), match="doesn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(1, 7), match="that's">
<re.Match object; span=(17, 22), match="don't">
<re.Match object; span=(15, 20), match="don't">
<re.Match object; span=(74, 81), match="doesn't">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(30, 35), match="can't">
<re.Match object; span=(22, 29), match="doesn't">
<re.Match object; span=(34, 40), match="that's">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(19, 24), match="can't">
<re.Match object; span=(32, 38), match="that's">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(13, 18), match="isn't">
<re.Match object; span=(21, 25), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(5, 11), match="didn't">
<re.Match object; span=(22, 27), match="don't">
<re.Match object; span=(36, 41), match="don't">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(20, 26), match="you're">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(101, 108), match="doesn't">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(82, 89), match="doesn't">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(21, 25), match="he'd">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(6, 11), match="don't">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(25, 30), match="don't">
<re.Match object; span=(11, 16), match="don't">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(44, 48), match="i've">
<re.Match object; span=(25, 29), match="i'll">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(30, 36), match="didn't">
<re.Match object; span=(33, 37), match="i've">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(3, 9), match="didn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(51, 59), match="couldn't">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(66, 71), match="don't">
<re.Match object; span=(47, 52), match="don't">
<re.Match object; span=(8, 11), match="i'm">
<re.Match object; span=(44, 47), match="i'd">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(11, 14), match="i'm">
<re.Match object; span=(11, 17), match="didn't">
<re.Match object; span=(25, 32), match="there's">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(37, 43), match="didn't">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(34, 37), match="i'd">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(63, 68), match="can't">
<re.Match object; span=(89, 93), match="he's">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(35, 40), match="don't">
<re.Match object; span=(41, 47), match="you've">
<re.Match object; span=(42, 49), match="doesn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 6), match="that's">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(19, 26), match="there's">
<re.Match object; span=(16, 21), match="can't">
<re.Match object; span=(41, 46), match="isn't">
<re.Match object; span=(7, 13), match="didn't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(47, 54), match="there's">
<re.Match object; span=(52, 57), match="don't">
<re.Match object; span=(44, 49), match="you'd">
<re.Match object; span=(60, 65), match="isn't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(20, 26), match="that's">
<re.Match object; span=(20, 26), match="that's">
<re.Match object; span=(40, 45), match="don't">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(35, 43), match="couldn't">
<re.Match object; span=(27, 32), match="won't">
<re.Match object; span=(21, 26), match="don't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(35, 42), match="there's">
<re.Match object; span=(9, 13), match="i've">
<re.Match object; span=(29, 34), match="don't">
<re.Match object; span=(69, 76), match="they've">
<re.Match object; span=(41, 46), match="don't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(53, 58), match="don't">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(34, 38), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(24, 29), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(25, 32), match="haven't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(51, 56), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(47, 51), match="i've">
<re.Match object; span=(19, 24), match="can't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(14, 19), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(26, 31), match="who's">
<re.Match object; span=(12, 17), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="that's">
<re.Match object; span=(10, 15), match="don't">
<re.Match object; span=(43, 49), match="what's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(28, 33), match="don't">
<re.Match object; span=(13, 18), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(10, 15), match="isn't">
<re.Match object; span=(20, 28), match="wouldn't">
<re.Match object; span=(26, 31), match="don't">
<re.Match object; span=(26, 29), match="i'm">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(20, 25), match="can't">
<re.Match object; span=(11, 16), match="can't">
<re.Match object; span=(4, 13), match="shouldn't">
<re.Match object; span=(34, 43), match="shouldn't">
<re.Match object; span=(0, 6), match="you've">
<re.Match object; span=(17, 23), match="you're">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(24, 30), match="that's">
<re.Match object; span=(28, 33), match="don't">
<re.Match object; span=(32, 35), match="i'm">
<re.Match object; span=(24, 29), match="don't">
<re.Match object; span=(7, 13), match="that's">
<re.Match object; span=(30, 36), match="you'll">
<re.Match object; span=(13, 19), match="you're">
<re.Match object; span=(0, 5), match="isn't">
<re.Match object; span=(34, 41), match="doesn't">
<re.Match object; span=(14, 20), match="you've">
<re.Match object; span=(16, 22), match="you're">
<re.Match object; span=(5, 10), match="can't">
<re.Match object; span=(26, 31), match="don't">
<re.Match object; span=(12, 16), match="i've">
<re.Match object; span=(23, 28), match="isn't">
<re.Match object; span=(0, 5), match="who's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(16, 22), match="didn't">
<re.Match object; span=(6, 9), match="i'm">
<re.Match object; span=(28, 34), match="didn't">
<re.Match object; span=(5, 12), match="doesn't">
<re.Match object; span=(16, 21), match="don't">
<re.Match object; span=(30, 35), match="don't">
<re.Match object; span=(2, 10), match="couldn't">
<re.Match object; span=(37, 41), match="i've">
<re.Match object; span=(18, 22), match="i've">
<re.Match object; span=(25, 33), match="couldn't">
<re.Match object; span=(38, 44), match="you'll">
<re.Match object; span=(28, 33), match="who's">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(6, 12), match="didn't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(23, 30), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(50, 57), match="they're">
<re.Match object; span=(12, 16), match="i've">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(28, 34), match="you'll">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(7, 14), match="haven't">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(16, 21), match="can't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(13, 21), match="couldn't">
<re.Match object; span=(17, 21), match="i've">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(12, 18), match="aren't">
<re.Match object; span=(17, 23), match="you've">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(38, 43), match="can't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(16, 22), match="that's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(36, 40), match="i've">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(33, 40), match="there's">
<re.Match object; span=(136, 139), match="i'm">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 6), match="that's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(74, 79), match="don't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(29, 36), match="there's">
<re.Match object; span=(4, 11), match="haven't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(30, 36), match="that's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(19, 22), match="i'm">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(33, 39), match="you'll">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(25, 32), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(12, 19), match="there's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(35, 39), match="i've">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(27, 31), match="i've">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(27, 31), match="i've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(47, 53), match="aren't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(23, 29), match="you're">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(17, 24), match="there's">
<re.Match object; span=(36, 42), match="didn't">
<re.Match object; span=(29, 36), match="there's">
<re.Match object; span=(26, 33), match="there's">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(51, 55), match="i've">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(5, 11), match="didn't">
<re.Match object; span=(0, 6), match="that's">
<re.Match object; span=(19, 27), match="couldn't">
<re.Match object; span=(37, 43), match="you'll">
<re.Match object; span=(87, 93), match="you'll">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(22, 27), match="we've">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(30, 37), match="there's">
<re.Match object; span=(46, 53), match="they're">
<re.Match object; span=(0, 5), match="we've">
<re.Match object; span=(19, 24), match="we've">
<re.Match object; span=(57, 63), match="that's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(9, 16), match="doesn't">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(17, 22), match="don't">
<re.Match object; span=(0, 8), match="wouldn't">
<re.Match object; span=(24, 29), match="don't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(29, 35), match="you're">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(27, 33), match="you'll">
<re.Match object; span=(9, 13), match="i've">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(16, 23), match="they've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(9, 14), match="can't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(20, 26), match="you'll">
<re.Match object; span=(52, 57), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(39, 44), match="don't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(12, 16), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(47, 52), match="don't">
<re.Match object; span=(13, 18), match="don't">
<re.Match object; span=(25, 29), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(57, 61), match="i've">
<re.Match object; span=(27, 32), match="won't">
<re.Match object; span=(7, 13), match="didn't">
<re.Match object; span=(23, 28), match="can't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(9, 12), match="i'm">
<re.Match object; span=(11, 17), match="aren't">
<re.Match object; span=(117, 122), match="can't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(11, 18), match="there's">
<re.Match object; span=(13, 17), match="i've">
<re.Match object; span=(26, 31), match="don't">
<re.Match object; span=(84, 90), match="you'll">
<re.Match object; span=(21, 27), match="that's">
<re.Match object; span=(10, 16), match="that's">
<re.Match object; span=(8, 11), match="i'm">
<re.Match object; span=(36, 45), match="shouldn't">
<re.Match object; span=(26, 30), match="i've">
<re.Match object; span=(21, 27), match="you're">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(23, 30), match="doesn't">
<re.Match object; span=(51, 57), match="didn't">
<re.Match object; span=(12, 21), match="shouldn't">
<re.Match object; span=(19, 23), match="i'll">
<re.Match object; span=(15, 21), match="what's">
<re.Match object; span=(42, 46), match="i've">
<re.Match object; span=(57, 64), match="doesn't">
<re.Match object; span=(12, 17), match="don't">
<re.Match object; span=(95, 103), match="wouldn't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(126, 132), match="didn't">
<re.Match object; span=(43, 49), match="you'll">
<re.Match object; span=(65, 71), match="you'll">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(19, 25), match="you're">
<re.Match object; span=(15, 21), match="what's">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(13, 16), match="i'm">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(28, 34), match="you'll">
<re.Match object; span=(64, 70), match="didn't">
<re.Match object; span=(20, 26), match="didn't">
<re.Match object; span=(23, 30), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(5, 10), match="isn't">
<re.Match object; span=(8, 14), match="that's">
<re.Match object; span=(13, 19), match="didn't">
<re.Match object; span=(17, 21), match="i've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(15, 21), match="didn't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(5, 13), match="wouldn't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(47, 53), match="that's">
<re.Match object; span=(16, 21), match="don't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(27, 33), match="that's">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(26, 32), match="what's">
<re.Match object; span=(16, 21), match="can't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(23, 30), match="doesn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(31, 37), match="didn't">
<re.Match object; span=(32, 35), match="i'm">
<re.Match object; span=(41, 45), match="i've">
<re.Match object; span=(42, 45), match="i'm">
<re.Match object; span=(64, 67), match="i'm">
<re.Match object; span=(90, 93), match="i'm">
<re.Match object; span=(37, 44), match="what'll">
<re.Match object; span=(36, 41), match="don't">
<re.Match object; span=(4, 9), match="won't">
<re.Match object; span=(57, 66), match="shouldn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(24, 29), match="don't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(28, 31), match="i'm">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(18, 25), match="there's">
<re.Match object; span=(41, 48), match="there's">
<re.Match object; span=(44, 49), match="isn't">
<re.Match object; span=(52, 59), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(25, 28), match="i'm">
<re.Match object; span=(19, 26), match="haven't">
<re.Match object; span=(6, 11), match="isn't">
<re.Match object; span=(17, 22), match="don't">
<re.Match object; span=(11, 17), match="didn't">
<re.Match object; span=(15, 18), match="i'm">
<re.Match object; span=(4, 11), match="doesn't">
<re.Match object; span=(52, 57), match="you'd">
<re.Match object; span=(9, 14), match="isn't">
<re.Match object; span=(9, 14), match="won't">
<re.Match object; span=(28, 36), match="couldn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(27, 31), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(66, 71), match="isn't">
<re.Match object; span=(53, 59), match="didn't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(0, 5), match="we've">
<re.Match object; span=(19, 24), match="we've">
<re.Match object; span=(31, 36), match="we've">
<re.Match object; span=(6, 11), match="don't">
<re.Match object; span=(18, 23), match="can't">
<re.Match object; span=(11, 17), match="didn't">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(48, 54), match="didn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(4, 10), match="that's">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(5, 12), match="there's">
<re.Match object; span=(45, 50), match="don't">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(42, 49), match="doesn't">
<re.Match object; span=(0, 5), match="who's">
<re.Match object; span=(17, 23), match="you're">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(37, 43), match="didn't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(58, 63), match="isn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 5), match="won't">
<re.Match object; span=(29, 33), match="i've">
<re.Match object; span=(27, 32), match="don't">
<re.Match object; span=(32, 35), match="i'd">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(51, 57), match="you've">
<re.Match object; span=(8, 14), match="you're">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(25, 31), match="you'll">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(38, 41), match="i'm">
<re.Match object; span=(54, 58), match="i've">
<re.Match object; span=(31, 35), match="i've">
<re.Match object; span=(16, 23), match="doesn't">
<re.Match object; span=(22, 27), match="isn't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(35, 41), match="aren't">
<re.Match object; span=(36, 42), match="you'll">
<re.Match object; span=(24, 30), match="you'll">
<re.Match object; span=(35, 39), match="i've">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(31, 37), match="that's">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(19, 25), match="that's">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(33, 37), match="i've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(2, 10), match="couldn't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(21, 27), match="that's">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(89, 96), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(8, 13), match="won't">
<re.Match object; span=(11, 14), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(16, 22), match="you're">
<re.Match object; span=(22, 28), match="you've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(11, 16), match="don't">
<re.Match object; span=(14, 17), match="i'd">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(11, 16), match="don't">
<re.Match object; span=(15, 20), match="don't">
<re.Match object; span=(30, 36), match="that's">
<re.Match object; span=(14, 21), match="doesn't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="we're">
<re.Match object; span=(14, 21), match="doesn't">
<re.Match object; span=(30, 35), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(20, 27), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(40, 45), match="won't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(5, 12), match="doesn't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(23, 30), match="doesn't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(14, 20), match="that's">
<re.Match object; span=(0, 6), match="you've">
<re.Match object; span=(23, 29), match="that's">
<re.Match object; span=(7, 13), match="didn't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(12, 16), match="i've">
<re.Match object; span=(14, 20), match="didn't">
<re.Match object; span=(11, 17), match="you're">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(10, 17), match="doesn't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(74, 80), match="you're">
<re.Match object; span=(3, 10), match="doesn't">
<re.Match object; span=(16, 20), match="i've">
<re.Match object; span=(114, 118), match="i've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(12, 19), match="doesn't">
<re.Match object; span=(38, 43), match="isn't">
<re.Match object; span=(35, 40), match="won't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(43, 49), match="didn't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(49, 53), match="i've">
<re.Match object; span=(2, 10), match="wouldn't">
<re.Match object; span=(0, 6), match="you've">
<re.Match object; span=(33, 38), match="can't">
<re.Match object; span=(0, 6), match="you've">
<re.Match object; span=(33, 38), match="can't">
<re.Match object; span=(9, 14), match="don't">
<re.Match object; span=(58, 65), match="doesn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(55, 62), match="doesn't">
<re.Match object; span=(15, 20), match="we're">
<re.Match object; span=(40, 45), match="isn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(28, 34), match="that's">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(24, 27), match="i'd">
<re.Match object; span=(5, 13), match="couldn't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(20, 25), match="isn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(14, 19), match="isn't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(46, 51), match="won't">
<re.Match object; span=(20, 27), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(17, 20), match="i'm">
<re.Match object; span=(23, 29), match="didn't">
<re.Match object; span=(19, 27), match="couldn't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(30, 35), match="don't">
<re.Match object; span=(8, 13), match="can't">
<re.Match object; span=(57, 63), match="you'll">
<re.Match object; span=(77, 83), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(17, 20), match="i'm">
<re.Match object; span=(16, 21), match="don't">
<re.Match object; span=(30, 33), match="i'm">
<re.Match object; span=(21, 25), match="i'll">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(25, 30), match="don't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(16, 22), match="you'll">
<re.Match object; span=(9, 14), match="don't">
<re.Match object; span=(12, 16), match="i've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(42, 49), match="doesn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(20, 23), match="i'd">
<re.Match object; span=(21, 28), match="doesn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(32, 38), match="that's">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 8), match="didn't">
<re.Match object; span=(34, 39), match="don't">
<re.Match object; span=(50, 57), match="doesn't">
<re.Match object; span=(4, 11), match="haven't">
<re.Match object; span=(27, 33), match="you've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(14, 20), match="didn't">
<re.Match object; span=(8, 13), match="don't">
<re.Match object; span=(11, 16), match="don't">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(26, 31), match="don't">
<re.Match object; span=(22, 27), match="don't">
<re.Match object; span=(36, 39), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(10, 15), match="can't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(5, 10), match="isn't">
<re.Match object; span=(46, 51), match="can't">
<re.Match object; span=(39, 45), match="that's">
<re.Match object; span=(16, 22), match="you're">
<re.Match object; span=(22, 26), match="he's">
<re.Match object; span=(27, 33), match="that's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(18, 25), match="there's">
<re.Match object; span=(89, 93), match="he's">
<re.Match object; span=(3, 8), match="we're">
<re.Match object; span=(29, 34), match="don't">
<re.Match object; span=(27, 32), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(3, 9), match="you've">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(21, 27), match="you'll">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(47, 51), match="i've">
<re.Match object; span=(26, 31), match="can't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(62, 66), match="i've">
<re.Match object; span=(27, 32), match="don't">
<re.Match object; span=(3, 9), match="you've">
<re.Match object; span=(6, 11), match="don't">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(67, 72), match="don't">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(33, 39), match="you'll">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(24, 28), match="he's">
<re.Match object; span=(43, 48), match="she's">
<re.Match object; span=(110, 117), match="they're">
<re.Match object; span=(9, 12), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(26, 30), match="i've">
<re.Match object; span=(51, 56), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(2, 7), match="won't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(37, 44), match="doesn't">
<re.Match object; span=(3, 7), match="he's">
<re.Match object; span=(22, 26), match="he's">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(21, 27), match="what's">
<re.Match object; span=(0, 8), match="couldn't">
<re.Match object; span=(5, 8), match="i'm">
<re.Match object; span=(55, 62), match="they're">
<re.Match object; span=(37, 42), match="can't">
<re.Match object; span=(19, 23), match="he's">
<re.Match object; span=(30, 36), match="that's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 4), match="he's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(8, 13), match="she's">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(4, 11), match="doesn't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(20, 26), match="that's">
<re.Match object; span=(45, 50), match="don't">
<re.Match object; span=(20, 24), match="i've">
<re.Match object; span=(8, 15), match="where's">
<re.Match object; span=(38, 43), match="don't">
<re.Match object; span=(11, 14), match="i'd">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(30, 35), match="don't">
<re.Match object; span=(14, 20), match="that's">
<re.Match object; span=(19, 23), match="i've">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(56, 61), match="don't">
<re.Match object; span=(15, 20), match="don't">
<re.Match object; span=(0, 7), match="they've">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(9, 17), match="couldn't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(34, 41), match="there's">
<re.Match object; span=(26, 31), match="don't">
<re.Match object; span=(20, 25), match="we're">
<re.Match object; span=(38, 45), match="doesn't">
<re.Match object; span=(53, 59), match="didn't">
<re.Match object; span=(37, 43), match="didn't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(73, 76), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(5, 12), match="weren't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(5, 8), match="i'm">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(34, 41), match="doesn't">
<re.Match object; span=(0, 5), match="can't">
<re.Match object; span=(38, 44), match="aren't">
<re.Match object; span=(16, 19), match="i'm">
<re.Match object; span=(0, 3), match="i'd">
<re.Match object; span=(15, 20), match="isn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(21, 26), match="don't">
<re.Match object; span=(17, 23), match="didn't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(43, 50), match="doesn't">
<re.Match object; span=(77, 82), match="isn't">
<re.Match object; span=(9, 15), match="didn't">
<re.Match object; span=(37, 40), match="i'd">
<re.Match object; span=(36, 43), match="doesn't">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(37, 42), match="can't">
<re.Match object; span=(28, 31), match="i'm">
<re.Match object; span=(148, 152), match="i'll">
<re.Match object; span=(25, 30), match="don't">
<re.Match object; span=(22, 29), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(30, 37), match="doesn't">
<re.Match object; span=(14, 19), match="don't">
<re.Match object; span=(22, 27), match="don't">
<re.Match object; span=(8, 13), match="we're">
<re.Match object; span=(20, 27), match="doesn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 6), match="you'll">
<re.Match object; span=(26, 33), match="doesn't">
<re.Match object; span=(35, 39), match="i'll">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(10, 17), match="doesn't">
<re.Match object; span=(7, 13), match="aren't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(42, 45), match="i'd">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(19, 24), match="don't">
<re.Match object; span=(51, 58), match="doesn't">
<re.Match object; span=(17, 20), match="i'm">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="let's">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(13, 19), match="you've">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(7, 14), match="there's">
<re.Match object; span=(22, 27), match="won't">
<re.Match object; span=(7, 14), match="doesn't">
<re.Match object; span=(13, 18), match="don't">
<re.Match object; span=(23, 31), match="couldn't">
<re.Match object; span=(8, 16), match="couldn't">
<re.Match object; span=(23, 29), match="you'll">
<re.Match object; span=(55, 60), match="don't">
<re.Match object; span=(16, 21), match="don't">
<re.Match object; span=(28, 33), match="can't">
<re.Match object; span=(6, 11), match="isn't">
<re.Match object; span=(46, 53), match="there's">
<re.Match object; span=(11, 17), match="didn't">
<re.Match object; span=(11, 16), match="can't">
<re.Match object; span=(22, 26), match="i've">
<re.Match object; span=(54, 59), match="won't">
<re.Match object; span=(60, 65), match="don't">
<re.Match object; span=(6, 12), match="didn't">
<re.Match object; span=(51, 57), match="didn't">
<re.Match object; span=(98, 104), match="hasn't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(43, 49), match="hadn't">
<re.Match object; span=(16, 19), match="i'd">
<re.Match object; span=(29, 35), match="didn't">
<re.Match object; span=(16, 22), match="that's">
<re.Match object; span=(30, 36), match="didn't">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(15, 18), match="i'm">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(20, 27), match="there's">
<re.Match object; span=(38, 44), match="that's">
<re.Match object; span=(18, 23), match="isn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(13, 20), match="doesn't">
<re.Match object; span=(26, 35), match="shouldn't">
<re.Match object; span=(5, 10), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(30, 34), match="i've">
<re.Match object; span=(28, 33), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(60, 66), match="didn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(31, 37), match="that's">
<re.Match object; span=(12, 18), match="hadn't">
<re.Match object; span=(11, 14), match="i'd">
<re.Match object; span=(29, 34), match="don't">
<re.Match object; span=(0, 5), match="you'd">
<re.Match object; span=(8, 14), match="hadn't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(18, 23), match="don't">
<re.Match object; span=(15, 20), match="don't">
<re.Match object; span=(48, 53), match="isn't">
<re.Match object; span=(5, 10), match="isn't">
<re.Match object; span=(0, 6), match="they'd">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(27, 32), match="don't">
<re.Match object; span=(11, 16), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(4, 8), match="i've">
<re.Match object; span=(21, 27), match="you're">
<re.Match object; span=(54, 61), match="doesn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(29, 34), match="can't">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(16, 21), match="can't">
<re.Match object; span=(25, 30), match="don't">
<re.Match object; span=(13, 20), match="doesn't">
<re.Match object; span=(16, 23), match="doesn't">
<re.Match object; span=(22, 27), match="don't">
<re.Match object; span=(34, 41), match="doesn't">
<re.Match object; span=(6, 11), match="we're">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(29, 33), match="i've">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(12, 16), match="i've">
<re.Match object; span=(6, 12), match="didn't">
<re.Match object; span=(48, 54), match="hasn't">
<re.Match object; span=(10, 16), match="you're">
<re.Match object; span=(31, 35), match="i've">
<re.Match object; span=(0, 9), match="shouldn't">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 4), match="i've">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(10, 15), match="let's">
<re.Match object; span=(3, 9), match="you're">
<re.Match object; span=(20, 25), match="don't">
<re.Match object; span=(33, 38), match="don't">
<re.Match object; span=(4, 7), match="i'm">
<re.Match object; span=(23, 28), match="isn't">
<re.Match object; span=(17, 23), match="that's">
<re.Match object; span=(4, 9), match="don't">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(36, 41), match="isn't">
<re.Match object; span=(21, 26), match="don't">
<re.Match object; span=(42, 47), match="can't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(10, 15), match="let's">
<re.Match object; span=(26, 31), match="isn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(36, 41), match="don't">
<re.Match object; span=(0, 5), match="who's">
<re.Match object; span=(26, 34), match="couldn't">
<re.Match object; span=(29, 35), match="that's">
<re.Match object; span=(15, 20), match="don't">
<re.Match object; span=(49, 56), match="doesn't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(70, 77), match="doesn't">
<re.Match object; span=(10, 16), match="that's">
<re.Match object; span=(3, 9), match="didn't">
<re.Match object; span=(55, 59), match="i've">
<re.Match object; span=(25, 29), match="i've">
<re.Match object; span=(13, 18), match="don't">
<re.Match object; span=(2, 10), match="couldn't">
<re.Match object; span=(6, 13), match="where's">
<re.Match object; span=(9, 14), match="isn't">
<re.Match object; span=(6, 13), match="where's">
<re.Match object; span=(20, 26), match="you'll">
<re.Match object; span=(12, 18), match="didn't">
<re.Match object; span=(56, 62), match="didn't">
<re.Match object; span=(69, 72), match="i'd">
<re.Match object; span=(18, 23), match="don't">
<re.Match object; span=(23, 27), match="he's">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(15, 21), match="what's">
<re.Match object; span=(0, 4), match="i'll">
<re.Match object; span=(30, 34), match="i've">
<re.Match object; span=(11, 15), match="i've">
<re.Match object; span=(8, 15), match="there's">
<re.Match object; span=(31, 35), match="i've">
<re.Match object; span=(17, 25), match="couldn't">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(6, 13), match="where's">
<re.Match object; span=(12, 15), match="i'm">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(20, 24), match="i've">
<re.Match object; span=(6, 13), match="where's">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(16, 21), match="can't">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(6, 13), match="where's">
<re.Match object; span=(9, 14), match="you'd">
<re.Match object; span=(15, 19), match="i've">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(13, 18), match="don't">
<re.Match object; span=(0, 7), match="doesn't">
<re.Match object; span=(24, 31), match="there's">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(40, 45), match="don't">
<re.Match object; span=(18, 24), match="you're">
<re.Match object; span=(20, 25), match="don't">
<re.Match object; span=(15, 20), match="can't">
<re.Match object; span=(4, 9), match="can't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(43, 47), match="i've">
<re.Match object; span=(19, 24), match="can't">
<re.Match object; span=(7, 12), match="don't">
<re.Match object; span=(28, 34), match="you'll">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(24, 28), match="i've">
<re.Match object; span=(0, 5), match="we're">
<re.Match object; span=(0, 5), match="we're">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(0, 3), match="i'm">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(19, 25), match="that's">
<re.Match object; span=(2, 7), match="can't">
<re.Match object; span=(12, 16), match="he's">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(0, 6), match="didn't">
<re.Match object; span=(32, 37), match="can't">
<re.Match object; span=(1, 6), match="we're">
<re.Match object; span=(5, 11), match="didn't">
<re.Match object; span=(80, 86), match="that's">
<re.Match object; span=(72, 77), match="isn't">
<re.Match object; span=(0, 5), match="we're">
<re.Match object; span=(20, 27), match="doesn't">
<re.Match object; span=(72, 78), match="didn't">
<re.Match object; span=(37, 42), match="don't">
<re.Match object; span=(4, 11), match="where's">
<re.Match object; span=(2, 7), match="won't">
<re.Match object; span=(24, 30), match="you've">
<re.Match object; span=(64, 70), match="that's">
<re.Match object; span=(33, 37), match="i've">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 6), match="what's">
<re.Match object; span=(128, 132), match="he's">
<re.Match object; span=(16, 23), match="doesn't">
<re.Match object; span=(47, 52), match="don't">
<re.Match object; span=(0, 6), match="you're">
<re.Match object; span=(7, 12), match="let's">
<re.Match object; span=(71, 77), match="didn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(23, 27), match="i've">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(18, 24), match="didn't">
<re.Match object; span=(23, 28), match="can't">
<re.Match object; span=(45, 51), match="you've">
<re.Match object; span=(30, 35), match="don't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(0, 5), match="don't">
<re.Match object; span=(59, 66), match="doesn't">
<re.Match object; span=(2, 7), match="don't">
<re.Match object; span=(23, 30), match="doesn't">
<re.Match object; span=(0, 7), match="where's">
<re.Match object; span=(8, 14), match="that's">
<re.Match object; span=(44, 52), match="couldn't">
<re.Match object; span=(36, 42), match="didn't">
<re.Match object; span=(31, 34), match="i'm">
<re.Match object; span=(63, 70), match="doesn't">
<re.Match object; span=(50, 56), match="didn't">
<re.Match object; span=(0, 7), match="there's">
<re.Match object; span=(31, 37), match="didn't">
<re.Match object; span=(35, 40), match="can't">
<re.Match object; span=(0, 3), match="i'm">
expand_contractions("doesn't react at all", contractions)
'does not react at all'
movie_reviews.head(1)['plot_synopsis'].item()
"jack ryan ford is on a working vacation in london with his family. he has retired from the cia and is a professor at the us naval academy. he is seen delivering a lecture at the royal naval academy in london.meanwhile, ryan's wife cathy and daughter sally are sightseeing near buckingham palace. sally and cathy come upon a british royal guard, and sally tries to get the guard to react by doing an improvised tap dance in front of him. she is impressed when the guard, trained to ignore distraction, does not react at all, and they leave.as sally and cathy walk away from the guard, en route to rendezvous with ryan, they walk by a stolen cab, in which sit three ulster liberation army terrorists kevin o'donnell, the driver, as well as sean miller sean bean and his younger brother patrick. the three are loading bullets into their guns as they prepare to carry out a scheduled ambush on lord william holmes, british secretary of state for northern ireland and a distant member of the british royal family the cousin of the queen mother. as they finish loading their weapons, another terrorist, annette, radios to them that lord holmes is leaving the palace with his wife and son.as lord holmes's car leaves the palace, the terrorists' car falls in line behind him. they follow the car, headed for the ambush point. as they drive, o'donnell and the miller brothers don ski masks.meanwhile, ryan is about to cross the street to meet with his wife and daughter. he arrives at the same time that lord holmes's car is passing through the area.suddenly, another cab, parked on the side of the road, driven by annette, pulls out and swerves in front of lord holmes's vehicle. simultaneously, o'donnell's vehicle swerves to a stop behind the car. the millers and another accomplice jump out and quickly plant a bomb underneath the chauffeured car's engine. ryan sees what is going on and hastily gets his family to cover just as the bomb is detonated. 
moments later, sean and another accomplice begin shooting into the car, riddling lord holmes's driver and bodyguard with bullet. the onslaught of bullets does not kill lord holmes, who is currently using his body to shield his family. sean marches over to the rear door and orders lord holmes out of the car at gunpoint. ryan, unable to stand being just a bystander, immediately breaks cover, runs up behind miller, and disarms him. as ryan grabs sean's pistol, he gets shot in the left shoulder by another terrorist. he shoots one of the other terrorists in the shoulder. patrick attempts to flee, but ryan fatally shoots him in the chest. seeing two of the royal guards racing towards the scene, the remaining terrorists quickly get back in their cars and drive off, leaving sean to be captured when the authorities arrive. they retreat back to a warehouse, where they change out license plates.while recovering, ryan is called to testify against miller as a witness. subsequently, miller is convicted on all charges and sentenced to life in prison. ryan is awarded the order of knight commander of the victorian order, and eventually returns to the united states.on the possibility that miller's ula comrades might try to liberate him, the authorities take no chances. to that extent, they set up decoy convoys on the day that miller is to be transferred to albany prison on the isle of wight. however, someone manages to tip off the ula as to which convoy is carrying miller. miller's motorcade is in transit when it comes to a stop at a drawbridge. the raised bridge, however, is a trap meant to block them in. moments after the convoy comes to a stop, miller's comrades attack, using rocket launchers to blow up the cars at the front and rear of the convoy. they then march up to the prison van and order the guards out at gunpoint by taking the bridge operator hostage and threatening to shoot him. one of the guards hesitates and is shot fatally. 
the other officer and the inspector are pulled out, thrown on the ground and forced to lie on their stomachs. o'donnell hands miller a pistol, with which he coldly executes both officers and the bridge operator by shooting them in the backs of their heads. they then depart the scene.miller and his companions flee on a cargo ship from britain to libya, to prepare for their next kidnapping attempt on lord holmes. miller however, cannot shake his anger towards ryan for killing his younger brother and persuades several members of his entourage to accompany him to the united states on a short mission targeting ryan and his family.miller travels to the united states, accompanied by annette, comrade ned clark, and a fourth henchman. annette and clark travel to the united states naval academy to ambush ryan as he is leaving work. ryan notices clark idling nearby. clark casually walks away, but ryan clearly unnerved, then gets even more nervous as he hears a car engine starting. his suspicions are well warrantedas he continues walking down the street, he looks in the reflection of a parked van's rearview door and sees that clark, and the stolen car driven by annette, are following him. ryan pretends not to see them until he is walking past a parked panel truck, at which point he ducks behind the truck. clark draws a silenced pistol and prepares to corner ryan, but ryan attacks him first. clark quickly gets the upper hand and throws ryan to the curb. he grabs his pistol and is about to shoot ryan when he is shot dead himself by a us marine sentry. annette quickly drives away in her jeep. ryan, realizing that miller is going to target cathy and sally, jumps into his car and drives off.simultaneously, miller and the other henchman follow cathy as she picks up sally from school, with miller riding shotgun. they prepare to follow the two, but are temporarily held up by a crossing guard. once the crossing guard clears, they pursue her to us highway 50. 
ryan tries to frantically call cathy on his car phone, but she is busy answering phone calls from the hospital. as he is trying to connect with her, miller and his henchman recklessly weave in and out of traffic to catch up with cathy. ryan eventually gets through to cathy and tells her to get off the highway and find the nearest police station. unfortunately, just as the call ends, miller's van comes charging up on her using the left shoulder. miller moves to the backseat of the van, grabs a submachine gun, and slides open the door. another car momentarily delays him, but once he has a clear aim, miller opens fire, spraying cathy's car with bullets. a multicar pileup ensues as cathy loses control of her car and crashes headon into a concrete barrier. ryan, who has found himself stuck in traffic headed in the other direction, sees the smoke from the crash and realizes the worst.cathy and sally are airlifted to a nearby hospital for treatment. enraged over the nearloss of his family, ryan decides to go back to work for the cia, having earlier rejected the appeal of his former superior, vice admiral james greer james earl jones.ryan's tireless work leads him to conclude that miller has taken refuge in a training camp, one of many located in libya. he also determines that a recent new arrival at the camp may be one of o'donnell's english informants, dennis cooley. ryan makes his recommendations to his superiors at the cia. greer invites jack to a live feed of an sas strike team attack on the camp. everyone in the camp is eliminated but unbeknownst to the cia and ryan, miller and his companions have already fled the camp and are on their way to the us to stage their next attack on lord holmes.lord holmes decides to visit ryan at his home to formally present his kcvo. 
with the aid of lord holmes' traitorous assistant, miller's group tracks holmes to this location, kills the diplomatic security service agents and maryland state troopers guarding the house, and attempts once more to kidnap lord holmes. ryan leads holmes and his family to safety while he attempts to lure miller and his companions away from his home.the fbi hostage rescue teams are scrambled to pick up holmes. upon realising that ryan is leading them away from holmes, miller's companions try to persuade miller to turn around, but an enraged and deranged miller kills his terrorist companions and continues his pursuit of ryan. ryan and miller fight hand to hand miller is killed when ryan impales him backward on a boat anchor, and his body is obliterated in the subsequent explosion of the craft.credits roll just after caroline ryan learns the gender of the child she is going to have, and before she tells jack and sally."
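The cleaned synopsis above still contains fused sentence boundaries such as `london.meanwhile` and `leave.as`, presumably where paragraph breaks were dropped during whitespace normalization. Whitespace-based tokenization treats each of these as a single token. A minimal sketch of one possible repair follows, assuming the text is already lowercased; `split_fused_sentences` is a hypothetical helper and not part of the original pipeline.

```python
import re

# Hypothetical helper, not part of the original notebook.
# Matches a period that sits directly between two lowercase letters,
# which is how the fused paragraph breaks appear in the cleaned text.
_FUSED_BOUNDARY = re.compile(r"(?<=[a-z])\.(?=[a-z])")


def split_fused_sentences(text: str) -> str:
    """Insert a space after a period that is glued to the next word."""
    return _FUSED_BOUNDARY.sub(". ", text)


sample = "he is seen delivering a lecture in london.meanwhile, ryan's wife is sightseeing."
print(split_fused_sentences(sample))
```

This rule would also split dotted abbreviations (for example `u.s.`) and anything resembling a domain name, so it only makes sense after URLs have been removed and only if such abbreviations are rare in the corpus.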
Experiments with lemmatization
import nltk
nltk.download('wordnet')
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /root/nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!
True
from nltk.stem import WordNetLemmatizer
text = movie_reviews.head(1)['plot_synopsis'].item()
lemmatizer = WordNetLemmatizer()
text_l1 = ' '.join([lemmatizer.lemmatize(word) for word in text.split()])
print(text_l1)
jack ryan ford is on a working vacation in london with his family. he ha retired from the cia and is a professor at the u naval academy. he is seen delivering a lecture at the royal naval academy in london.meanwhile, ryan's wife cathy and daughter sally are sightseeing near buckingham palace. sally and cathy come upon a british royal guard, and sally try to get the guard to react by doing an improvised tap dance in front of him. she is impressed when the guard, trained to ignore distraction, doe not react at all, and they leave.as sally and cathy walk away from the guard, en route to rendezvous with ryan, they walk by a stolen cab, in which sit three ulster liberation army terrorist kevin o'donnell, the driver, a well a sean miller sean bean and his younger brother patrick. the three are loading bullet into their gun a they prepare to carry out a scheduled ambush on lord william holmes, british secretary of state for northern ireland and a distant member of the british royal family the cousin of the queen mother. a they finish loading their weapons, another terrorist, annette, radio to them that lord holmes is leaving the palace with his wife and son.as lord holmes's car leaf the palace, the terrorists' car fall in line behind him. they follow the car, headed for the ambush point. a they drive, o'donnell and the miller brother don ski masks.meanwhile, ryan is about to cross the street to meet with his wife and daughter. he arrives at the same time that lord holmes's car is passing through the area.suddenly, another cab, parked on the side of the road, driven by annette, pull out and swerve in front of lord holmes's vehicle. simultaneously, o'donnell's vehicle swerve to a stop behind the car. the miller and another accomplice jump out and quickly plant a bomb underneath the chauffeured car's engine. ryan see what is going on and hastily get his family to cover just a the bomb is detonated. 
moment later, sean and another accomplice begin shooting into the car, riddling lord holmes's driver and bodyguard with bullet. the onslaught of bullet doe not kill lord holmes, who is currently using his body to shield his family. sean march over to the rear door and order lord holmes out of the car at gunpoint. ryan, unable to stand being just a bystander, immediately break cover, run up behind miller, and disarms him. a ryan grab sean's pistol, he get shot in the left shoulder by another terrorist. he shoot one of the other terrorist in the shoulder. patrick attempt to flee, but ryan fatally shoot him in the chest. seeing two of the royal guard racing towards the scene, the remaining terrorist quickly get back in their car and drive off, leaving sean to be captured when the authority arrive. they retreat back to a warehouse, where they change out license plates.while recovering, ryan is called to testify against miller a a witness. subsequently, miller is convicted on all charge and sentenced to life in prison. ryan is awarded the order of knight commander of the victorian order, and eventually return to the united states.on the possibility that miller's ula comrade might try to liberate him, the authority take no chances. to that extent, they set up decoy convoy on the day that miller is to be transferred to albany prison on the isle of wight. however, someone manages to tip off the ula a to which convoy is carrying miller. miller's motorcade is in transit when it come to a stop at a drawbridge. the raised bridge, however, is a trap meant to block them in. moment after the convoy come to a stop, miller's comrade attack, using rocket launcher to blow up the car at the front and rear of the convoy. they then march up to the prison van and order the guard out at gunpoint by taking the bridge operator hostage and threatening to shoot him. one of the guard hesitates and is shot fatally. 
the other officer and the inspector are pulled out, thrown on the ground and forced to lie on their stomachs. o'donnell hand miller a pistol, with which he coldly executes both officer and the bridge operator by shooting them in the back of their heads. they then depart the scene.miller and his companion flee on a cargo ship from britain to libya, to prepare for their next kidnapping attempt on lord holmes. miller however, cannot shake his anger towards ryan for killing his younger brother and persuades several member of his entourage to accompany him to the united state on a short mission targeting ryan and his family.miller travel to the united states, accompanied by annette, comrade ned clark, and a fourth henchman. annette and clark travel to the united state naval academy to ambush ryan a he is leaving work. ryan notice clark idling nearby. clark casually walk away, but ryan clearly unnerved, then get even more nervous a he hears a car engine starting. his suspicion are well warrantedas he continues walking down the street, he look in the reflection of a parked van's rearview door and see that clark, and the stolen car driven by annette, are following him. ryan pretend not to see them until he is walking past a parked panel truck, at which point he duck behind the truck. clark draw a silenced pistol and prepares to corner ryan, but ryan attack him first. clark quickly get the upper hand and throw ryan to the curb. he grab his pistol and is about to shoot ryan when he is shot dead himself by a u marine sentry. annette quickly drive away in her jeep. ryan, realizing that miller is going to target cathy and sally, jump into his car and drive off.simultaneously, miller and the other henchman follow cathy a she pick up sally from school, with miller riding shotgun. they prepare to follow the two, but are temporarily held up by a crossing guard. once the crossing guard clears, they pursue her to u highway 50. 
ryan try to frantically call cathy on his car phone, but she is busy answering phone call from the hospital. a he is trying to connect with her, miller and his henchman recklessly weave in and out of traffic to catch up with cathy. ryan eventually get through to cathy and tell her to get off the highway and find the nearest police station. unfortunately, just a the call ends, miller's van come charging up on her using the left shoulder. miller move to the backseat of the van, grab a submachine gun, and slide open the door. another car momentarily delay him, but once he ha a clear aim, miller open fire, spraying cathy's car with bullets. a multicar pileup ensues a cathy loses control of her car and crash headon into a concrete barrier. ryan, who ha found himself stuck in traffic headed in the other direction, see the smoke from the crash and realizes the worst.cathy and sally are airlifted to a nearby hospital for treatment. enraged over the nearloss of his family, ryan decides to go back to work for the cia, having earlier rejected the appeal of his former superior, vice admiral james greer james earl jones.ryan's tireless work lead him to conclude that miller ha taken refuge in a training camp, one of many located in libya. he also determines that a recent new arrival at the camp may be one of o'donnell's english informants, dennis cooley. ryan make his recommendation to his superior at the cia. greer invite jack to a live feed of an sa strike team attack on the camp. everyone in the camp is eliminated but unbeknownst to the cia and ryan, miller and his companion have already fled the camp and are on their way to the u to stage their next attack on lord holmes.lord holmes decides to visit ryan at his home to formally present his kcvo. with the aid of lord holmes' traitorous assistant, miller's group track holmes to this location, kill the diplomatic security service agent and maryland state trooper guarding the house, and attempt once more to kidnap lord holmes. 
ryan lead holmes and his family to safety while he attempt to lure miller and his companion away from his home.the fbi hostage rescue team are scrambled to pick up holmes. upon realising that ryan is leading them away from holmes, miller's companion try to persuade miller to turn around, but an enraged and deranged miller kill his terrorist companion and continues his pursuit of ryan. ryan and miller fight hand to hand miller is killed when ryan impales him backward on a boat anchor, and his body is obliterated in the subsequent explosion of the craft.credits roll just after caroline ryan learns the gender of the child she is going to have, and before she tell jack and sally.
from nltk.corpus import wordnet
def get_wordnet_pos(word):
    """Map POS tag to first character lemmatize() accepts"""
    tag = nltk.pos_tag([word])[0][1][0].upper()
    tag_dict = {"J": wordnet.ADJ,
                "N": wordnet.NOUN,
                "V": wordnet.VERB,
                "R": wordnet.ADV}
    return tag_dict.get(tag, wordnet.NOUN)
text_l2 = ' '.join([lemmatizer.lemmatize(w, get_wordnet_pos(w)) for w in nltk.word_tokenize(text)])
print(text_l2)
jack ryan ford be on a work vacation in london with his family . he have retire from the cia and be a professor at the u naval academy . he be see deliver a lecture at the royal naval academy in london.meanwhile , ryan 's wife cathy and daughter sally be sightsee near buckingham palace . sally and cathy come upon a british royal guard , and sally try to get the guard to react by do an improvised tap dance in front of him . she be impressed when the guard , train to ignore distraction , do not react at all , and they leave.as sally and cathy walk away from the guard , en route to rendezvous with ryan , they walk by a steal cab , in which sit three ulster liberation army terrorist kevin o'donnell , the driver , a well a sean miller sean bean and his young brother patrick . the three be load bullet into their gun a they prepare to carry out a schedule ambush on lord william holmes , british secretary of state for northern ireland and a distant member of the british royal family the cousin of the queen mother . a they finish load their weapon , another terrorist , annette , radio to them that lord holmes be leave the palace with his wife and son.as lord holmes 's car leaf the palace , the terrorist ' car fall in line behind him . they follow the car , head for the ambush point . a they drive , o'donnell and the miller brother don ski masks.meanwhile , ryan be about to cross the street to meet with his wife and daughter . he arrives at the same time that lord holmes 's car be passing through the area.suddenly , another cab , park on the side of the road , driven by annette , pull out and swerve in front of lord holmes 's vehicle . simultaneously , o'donnell 's vehicle swerve to a stop behind the car . the miller and another accomplice jump out and quickly plant a bomb underneath the chauffeur car 's engine . ryan see what be go on and hastily get his family to cover just a the bomb be detonate . 
moment later , sean and another accomplice begin shoot into the car , riddle lord holmes 's driver and bodyguard with bullet . the onslaught of bullet do not kill lord holmes , who be currently use his body to shield his family . sean march over to the rear door and order lord holmes out of the car at gunpoint . ryan , unable to stand be just a bystander , immediately break cover , run up behind miller , and disarms him . a ryan grab sean 's pistol , he get shot in the left shoulder by another terrorist . he shoot one of the other terrorist in the shoulder . patrick attempt to flee , but ryan fatally shoot him in the chest . see two of the royal guard race towards the scene , the remain terrorist quickly get back in their car and drive off , leave sean to be capture when the authority arrive . they retreat back to a warehouse , where they change out license plates.while recover , ryan be call to testify against miller a a witness . subsequently , miller be convict on all charge and sentence to life in prison . ryan be award the order of knight commander of the victorian order , and eventually return to the united states.on the possibility that miller 's ula comrade might try to liberate him , the authority take no chance . to that extent , they set up decoy convoy on the day that miller be to be transfer to albany prison on the isle of wight . however , someone manages to tip off the ula a to which convoy be carry miller . miller 's motorcade be in transit when it come to a stop at a drawbridge . the raise bridge , however , be a trap meant to block them in . moment after the convoy come to a stop , miller 's comrade attack , use rocket launcher to blow up the car at the front and rear of the convoy . they then march up to the prison van and order the guard out at gunpoint by take the bridge operator hostage and threaten to shoot him . one of the guard hesitates and be shot fatally . 
the other officer and the inspector be pull out , thrown on the ground and force to lie on their stomach . o'donnell hand miller a pistol , with which he coldly executes both officer and the bridge operator by shoot them in the back of their head . they then depart the scene.miller and his companion flee on a cargo ship from britain to libya , to prepare for their next kidnapping attempt on lord holmes . miller however , can not shake his anger towards ryan for kill his young brother and persuades several member of his entourage to accompany him to the united state on a short mission target ryan and his family.miller travel to the united state , accompany by annette , comrade ned clark , and a fourth henchman . annette and clark travel to the united state naval academy to ambush ryan a he be leave work . ryan notice clark idle nearby . clark casually walk away , but ryan clearly unnerved , then get even more nervous a he hears a car engine start . his suspicion be well warrantedas he continue walk down the street , he look in the reflection of a park van 's rearview door and see that clark , and the steal car driven by annette , be follow him . ryan pretend not to see them until he be walk past a park panel truck , at which point he duck behind the truck . clark draw a silence pistol and prepares to corner ryan , but ryan attack him first . clark quickly get the upper hand and throw ryan to the curb . he grab his pistol and be about to shoot ryan when he be shot dead himself by a u marine sentry . annette quickly drive away in her jeep . ryan , realize that miller be go to target cathy and sally , jump into his car and drive off.simultaneously , miller and the other henchman follow cathy a she pick up sally from school , with miller rid shotgun . they prepare to follow the two , but be temporarily held up by a cross guard . once the cross guard clear , they pursue her to u highway 50. 
ryan try to frantically call cathy on his car phone , but she be busy answer phone call from the hospital . a he be try to connect with her , miller and his henchman recklessly weave in and out of traffic to catch up with cathy . ryan eventually get through to cathy and tell her to get off the highway and find the near police station . unfortunately , just a the call end , miller 's van come charge up on her use the left shoulder . miller move to the backseat of the van , grab a submachine gun , and slide open the door . another car momentarily delay him , but once he have a clear aim , miller open fire , spray cathy 's car with bullet . a multicar pileup ensues a cathy loses control of her car and crash headon into a concrete barrier . ryan , who have found himself stuck in traffic head in the other direction , see the smoke from the crash and realizes the worst.cathy and sally be airlift to a nearby hospital for treatment . enrage over the nearloss of his family , ryan decides to go back to work for the cia , have earlier reject the appeal of his former superior , vice admiral james greer james earl jones.ryan 's tireless work lead him to conclude that miller have take refuge in a training camp , one of many locate in libya . he also determines that a recent new arrival at the camp may be one of o'donnell 's english informant , dennis cooley . ryan make his recommendation to his superior at the cia . greer invite jack to a live feed of an sa strike team attack on the camp . everyone in the camp be eliminate but unbeknownst to the cia and ryan , miller and his companion have already flee the camp and be on their way to the u to stage their next attack on lord holmes.lord holmes decides to visit ryan at his home to formally present his kcvo . 
with the aid of lord holmes ' traitorous assistant , miller 's group track holmes to this location , kill the diplomatic security service agent and maryland state trooper guard the house , and attempt once more to kidnap lord holmes . ryan lead holmes and his family to safety while he attempt to lure miller and his companion away from his home.the fbi hostage rescue team be scramble to pick up holmes . upon realise that ryan be lead them away from holmes , miller 's companion try to persuade miller to turn around , but an enrage and derange miller kill his terrorist companion and continue his pursuit of ryan . ryan and miller fight hand to hand miller be kill when ryan impales him backward on a boat anchor , and his body be obliterate in the subsequent explosion of the craft.credits roll just after caroline ryan learns the gender of the child she be go to have , and before she tell jack and sally .
movie_reviews['review_text'] = movie_reviews['review_text'].map(lambda x: ' '.join([lemmatizer.lemmatize(w, get_wordnet_pos(w)) for w in nltk.word_tokenize(x)]))
movie_reviews['plot_synopsis'] = movie_reviews['plot_synopsis'].map(lambda x: ' '.join([lemmatizer.lemmatize(w, get_wordnet_pos(w)) for w in nltk.word_tokenize(x)]))
movie_reviews['plot_summary'] = movie_reviews['plot_summary'].map(lambda x: ' '.join([lemmatizer.lemmatize(w, get_wordnet_pos(w)) for w in nltk.word_tokenize(x)]))
movie_reviews['review_summary'] = movie_reviews['review_summary'].map(lambda x: ' '.join([lemmatizer.lemmatize(w, get_wordnet_pos(w)) for w in nltk.word_tokenize(x)]))
from transformers import AutoTokenizer
# Specify the model name
model_name = "bert-base-uncased"
# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
print(len(tokenizer.tokenize(text, add_special_tokens=False, truncation=False)))
print(len(tokenizer.tokenize(text_l1, add_special_tokens=False, truncation=False)))
print(len(tokenizer.tokenize(text_l2, add_special_tokens=False, truncation=False)))
1797 1794 1791
print(len(set(tokenizer.tokenize(text, add_special_tokens=False, truncation=False))))
print(len(set(tokenizer.tokenize(text_l1, add_special_tokens=False, truncation=False))))
print(len(set(tokenizer.tokenize(text_l2, add_special_tokens=False, truncation=False))))
615 584 545
Lemmatization did not significantly reduce the number of tokens (1797, 1794 and 1791 for text, text_l1 and text_l2 respectively), but it did shrink the vocabulary (615, 584 and 545 unique tokens). For now, the data is saved without lemmatization. Models such as BERT do not require lemmatization, whereas the LSTM model does. Once the pre-processed texts are persisted without lemmatization, we will lemmatize them as required in this notebook.
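The effect described above (same sequence length, smaller vocabulary) can be reproduced on a toy example. This is a minimal sketch, not the notebook's pipeline: it uses whitespace tokens instead of the BERT tokenizer, and `toy_lemmas` is a hypothetical lookup table standing in for a real lemmatizer.

```python
# Toy illustration only: `toy_lemmas` is a hypothetical stand-in for a real lemmatizer.
toy_lemmas = {"walks": "walk", "walked": "walk", "walking": "walk", "was": "be"}


def lemmatize_tokens(tokens):
    """Map each token to its lemma, leaving unknown tokens unchanged."""
    return [toy_lemmas.get(tok, tok) for tok in tokens]


tokens = "ryan walks while miller walked and clark was walking".split()
lemmas = lemmatize_tokens(tokens)

# One lemma is produced per input token, so the sequence length is unchanged.
print(len(tokens), len(lemmas))

# Inflected forms collapse onto one lemma, so the set of distinct tokens shrinks.
print(len(set(tokens)), len(set(lemmas)))
```

Because lemmatization replaces tokens one for one, it cannot help a long text fit into a fixed sequence length; its benefit is limited to models that learn their own vocabulary, such as the LSTM.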
movie_reviews['review_summary_num_tokens']=movie_reviews['review_summary'].map(lambda x: len(tokenizer.tokenize(x, add_special_tokens=False, truncation=False)))
movie_reviews['review_text_num_tokens']= movie_reviews['review_text'].map(lambda x: len(tokenizer.tokenize(x, add_special_tokens=False, truncation=False)))
movie_reviews['plot_synopsis_num_tokens']= movie_reviews['plot_synopsis'].map(lambda x: len(tokenizer.tokenize(x, add_special_tokens=False, truncation=False)))
movie_reviews['plot_summary_num_tokens']= movie_reviews['plot_summary'].map(lambda x: len(tokenizer.tokenize(x, add_special_tokens=False, truncation=False)))
# Write the resulting data frame as csv for later use
import os
csv_file_path = os.path.join(dataset_path, "movie_reviews.csv")
movie_reviews.to_csv(csv_file_path, index=False)
EDA¶
import csv
import os
import pandas as pd
dataset_path = os.path.join(prefix_path, "dataset", "imdb_spoiler")
csv_file_path = os.path.join(dataset_path, "movie_reviews.csv")
movie_reviews = pd.read_csv(csv_file_path)
movie_reviews
| | movie_id | plot_summary | duration | genre | rating_x | release_date | plot_synopsis | review_date | user_id | is_spoiler | review_text | rating_y | review_summary | review_id | _merge | review_summary_num_tokens | review_text_num_tokens | plot_synopsis_num_tokens | plot_summary_num_tokens |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | ['Action', 'Thriller'] | 6.9 | 1992-06-05 | jack ryan ford is on a working vacation in lon... | 5 March 2008 | ur16517420 | True | the second tom clancy novel made into a film '... | 6 | decent clancy thriller | bbc4b9ff-ea1b-455a-8b7e-32370d35fe17 | both | 3 | 431 | 1797 | 217 |
| 1 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | ['Action', 'Thriller'] | 6.9 | 1992-06-05 | jack ryan ford is on a working vacation in lon... | 15 October 2007 | ur13887584 | True | the second in what looks like becoming the 'ja... | 7 | jack ryan chapter ii | e04d4f23-89b5-4715-add8-1ee37d46dd99 | both | 4 | 129 | 1797 | 217 |
| 2 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | ['Action', 'Thriller'] | 6.9 | 1992-06-05 | jack ryan ford is on a working vacation in lon... | 16 September 2013 | ur2403537 | True | i was not a fan of the hunt for red october. i... | 8 | it's filled to the brim with excitement and it... | b28cbca4-dc30-421c-8e3a-dc2e3cba98c1 | both | 20 | 401 | 1797 | 217 |
| 3 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | ['Action', 'Thriller'] | 6.9 | 1992-06-05 | jack ryan ford is on a working vacation in lon... | 15 May 2007 | ur0819382 | True | jack ryan harrison ford is a cia analyst who s... | 7 | patriotic gore. | c657363f-224d-48db-a029-8e4ddd2b248d | both | 3 | 1045 | 1797 | 217 |
| 4 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | ['Action', 'Thriller'] | 6.9 | 1992-06-05 | jack ryan ford is on a working vacation in lon... | 29 February 2004 | ur2567338 | True | this was one of the big summer movies of 1992.... | 7 | harrison ford in another tom clancy adaptation | 44c6a555-674d-4f1c-a9fa-88b35b82c8ba | both | 7 | 236 | 1797 | 217 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 538823 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | ['Action', 'Drama', 'History'] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 2 August 2017 | ur78983408 | False | dunkirk is a beautifully done movie that has h... | 10 | possibly the greatest war film ever made | ef14e72a-77cd-4b9c-9f85-37d01c55fded | both | 7 | 175 | 1567 | 52 |
| 538824 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | ['Action', 'Drama', 'History'] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 2 August 2017 | ur53645293 | False | dunkirk is one of the rare cases a film receiv... | 10 | why this could actually be one of the best war... | d4bfc31d-ae8e-4e48-b616-936132153630 | both | 16 | 840 | 1567 | 52 |
| 538825 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | ['Action', 'Drama', 'History'] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 31 July 2017 | ur45202903 | False | film gave insufficient background on what was ... | 2 | disappointed | e47d2365-78d9-44c1-86e8-9c02d190b0cb | both | 1 | 57 | 1567 | 52 |
| 538826 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | ['Action', 'Drama', 'History'] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 28 July 2017 | ur1319404 | False | in screen writing, a shot is an image captured... | 7 | dunkirk for millennials | d985d458-0b1b-4cec-a5d2-6e9431770ce0 | both | 4 | 146 | 1567 | 52 |
| 538827 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | ['Action', 'Drama', 'History'] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 25 July 2017 | ur4161069 | False | in a movie that entirely engulfs you it's rath... | 10 | christopher nolan reaches yet another new high... | 88c38da2-d4f7-4ee2-ae1b-76c8d430d495 | both | 12 | 345 | 1567 | 52 |
538828 rows × 19 columns
import seaborn as sns
import matplotlib.pyplot as plt
Hist plot of number of tokens in review_text¶
# Check whether the number of tokens in review_text correlates with is_spoiler
sns.histplot(x='review_text_num_tokens', hue='is_spoiler', data=movie_reviews)
<Axes: xlabel='review_text_num_tokens', ylabel='Count'>
print(movie_reviews['review_text_num_tokens'].value_counts().sort_values(ascending=False))
review_text_num_tokens
156 2827
147 2802
158 2796
155 2785
151 2785
...
1778 1
2830 1
2293 1
1871 1
1536 1
Name: count, Length: 1614, dtype: int64
movie_reviews['review_text_num_tokens'].mode()
0 156 Name: review_text_num_tokens, dtype: int64
Histplot of all quantitative features¶
# Split Quantitative from Categorical
x_quantitative = ['rating_x', 'rating_y', 'review_text_num_tokens', 'review_summary_num_tokens','plot_synopsis_num_tokens', 'plot_summary_num_tokens']
y = ['is_spoiler']
#x_categorical = [feature for feature in movie_reviews.columns if ((feature not in x_quantitative) and (feature not in y)) ]
#x_categorical
def hist_plots(data, columns, rem_ol=False, thres=0.99, scale_graph=9, n_cols=3, aspect_ratio=2/3):
    '''Create multiple Histograms plots using a subset of variables specified.
    Args:
        data: Input data-frame containing variables we wish to plot.
        columns: Listing of column-names we wish to plot.
        rem_ol: Remove observations greater than specific percentile defined by thres argument.
        thres: Percentile that will be used if rem_ol=True.
        scale_graph: Adjust the total size of the graph.
        n_cols: Adjust how many graphs we have on each row.
        aspect_ratio: Adjust the aspect ratio of each individual graph. For squared graphs use 1/1.
    '''
    # Adjusting how many rows the grid will have and proper sizes
    n_rows = len(columns)//n_cols+(len(columns)%n_cols>0)
    fig, axes = plt.subplots(n_rows, n_cols, figsize=(scale_graph, (scale_graph/n_cols)*aspect_ratio*n_rows))
    # Plotting
    fig.suptitle(f'Histograms of {len(columns)} columns', y=1, size=15)
    axes = axes.flatten()
    for i, feature in enumerate(columns):
        if rem_ol:
            lim = data[feature].quantile([thres]).iloc[0]
            condition = data[feature] < lim
            x = data.loc[condition, [feature, 'is_spoiler']]
            print(f'{feature}: Observations greater than P{round(thres*100)} removed')
            print(x.columns)
        else:
            x = data[[feature, 'is_spoiler']]
            print(x.columns)
        sns.histplot(x=feature, data=x, ax=axes[i], hue='is_spoiler')
        axes[i].set_title(feature, size=10)
    plt.tight_layout()
hist_plots(movie_reviews,x_quantitative, n_cols=2)
Index(['rating_x', 'is_spoiler'], dtype='object') Index(['rating_y', 'is_spoiler'], dtype='object') Index(['review_text_num_tokens', 'is_spoiler'], dtype='object') Index(['review_summary_num_tokens', 'is_spoiler'], dtype='object') Index(['plot_synopsis_num_tokens', 'is_spoiler'], dtype='object') Index(['plot_summary_num_tokens', 'is_spoiler'], dtype='object')
The number of tokens in plot_summary looks more reasonable than in plot_synopsis. We should try training the model with plot_summary as well as plot_synopsis.
As the number of tokens in review_text increases, the review becomes more likely to be a spoiler. Few reviews have fewer than 30 tokens.
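One way to check the length observation numerically, rather than only by eye from the histogram, is to compute the spoiler rate per token-length bucket. The sketch below is plain Python on made-up rows; `spoiler_rate_by_bucket`, `bucket_size` and `toy_rows` are illustrative names, not part of the notebook.

```python
from collections import defaultdict


def spoiler_rate_by_bucket(rows, bucket_size=100):
    """Return {bucket_start: share of spoilers} for (num_tokens, is_spoiler) pairs."""
    totals = defaultdict(int)
    spoilers = defaultdict(int)
    for num_tokens, is_spoiler in rows:
        # Bucket 0 covers 0..bucket_size-1 tokens, bucket 100 covers 100..199, and so on.
        start = (num_tokens // bucket_size) * bucket_size
        totals[start] += 1
        spoilers[start] += int(is_spoiler)
    return {start: spoilers[start] / totals[start] for start in sorted(totals)}


# Made-up rows for illustration; these are not values from the dataset.
toy_rows = [(40, False), (80, False), (150, False), (180, True),
            (420, True), (450, True), (470, False)]
rates = spoiler_rate_by_bucket(toy_rows)
for start, rate in rates.items():
    print(f"{start}-{start + 99} tokens: spoiler rate {rate:.2f}")
```

On the real data frame the same table could be built with `pd.cut` on review_text_num_tokens followed by a `groupby` and the mean of is_spoiler; a rate that rises from bucket to bucket would support the observation.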
Relationship between genres and is_spoiler.¶
movie_reviews['genre'].describe()
count 538828 unique 218 top ['Action', 'Adventure', 'Sci-Fi'] freq 39020 Name: genre, dtype: object
# Split the genre string on commas; the isinstance guard keeps the cell safe to re-run
movie_reviews['genre'] = movie_reviews['genre'].map(lambda x: x.split(',') if isinstance(x, str) else x)
movie_reviews['genre'] = movie_reviews['genre'].apply(lambda x: x if isinstance(x, list) else [])
movie_reviews['genre'].describe()
count 538828 unique 218 top [['Action', 'Adventure', 'Sci-Fi']] freq 39020 Name: genre, dtype: object
genres_spoilers = movie_reviews[['genre', 'is_spoiler']]
genres_spoilers = genres_spoilers.explode('genre')
#genres_spoilers = genres_spoilers.groupby(['genre', 'is_spoiler']).size().reset_index(name='count')
genres_spoilers
| | genre | is_spoiler |
|---|---|---|
| 0 | ['Action' | True |
| 0 | 'Thriller'] | True |
| 1 | ['Action' | True |
| 1 | 'Thriller'] | True |
| 2 | ['Action' | True |
| ... | ... | ... |
| 538826 | 'Drama' | False |
| 538826 | 'History'] | False |
| 538827 | ['Action' | False |
| 538827 | 'Drama' | False |
| 538827 | 'History'] | False |
1430664 rows × 2 columns
sns.countplot(data=genres_spoilers, x='genre', hue='is_spoiler')
# Rotate x-axis labels by 90 degrees
plt.xticks(rotation=90)
# Show the plot
plt.show()
Every genre contains a share of spoiler reviews. No clear correlation between genre and is_spoiler was found.
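The exploded table above contains values such as `['Action'` and `'Thriller']`, because the genre column is read back from CSV as the string form of a list and is then split on commas. A possible alternative, sketched here under that assumption, is to parse the string with `ast.literal_eval` so that each genre becomes a clean label before counting. `parse_genres` and `toy_rows` are hypothetical names used only for this illustration.

```python
import ast
from collections import Counter


def parse_genres(raw):
    """Turn a stringified list such as "['Action', 'Thriller']" into a real list."""
    if isinstance(raw, list):
        return raw
    try:
        parsed = ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        # Anything that is not a valid Python literal is treated as "no genres".
        return []
    return parsed if isinstance(parsed, list) else []


# Made-up rows for illustration: (genre string as stored in the CSV, is_spoiler)
toy_rows = [
    ("['Action', 'Thriller']", True),
    ("['Action', 'Drama', 'History']", False),
    ("['Drama']", True),
]

totals, spoilers = Counter(), Counter()
for raw_genre, is_spoiler in toy_rows:
    for genre in parse_genres(raw_genre):
        totals[genre] += 1
        spoilers[genre] += int(is_spoiler)

# Share of spoiler reviews per genre, comparable across genres of different size.
spoiler_rate = {genre: spoilers[genre] / totals[genre] for genre in totals}
print(spoiler_rate)
```

Comparing per-genre spoiler rates instead of raw counts removes the effect of genre popularity, which makes a missing correlation easier to confirm.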
Deep Learning training on set of 10000 positive and 10000 negative examples¶
Take a sample of 10000 rows and test on review_text and plot_summary.
import csv
import os
import pandas as pd
dataset_path = os.path.join(prefix_path, "dataset", "imdb_spoiler")
csv_file_path = os.path.join(dataset_path, "movie_reviews.csv")
movie_reviews = pd.read_csv(csv_file_path)
# Convert boolean is_spoiler to 0 and 1.
movie_reviews['is_spoiler'] = movie_reviews['is_spoiler'].astype(int)
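# Keep reviews with at least 50 tokens; copy so the new column below is added to an independent frame
movie_reviews = movie_reviews[movie_reviews['review_text_num_tokens'] >= 50].copy()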
movie_reviews['sum num tokens review_text plus plot summary'] = movie_reviews['review_text_num_tokens'] + movie_reviews['plot_summary_num_tokens']
import seaborn as sns
import matplotlib.pyplot as plt
sns.histplot(x='sum num tokens review_text plus plot summary', hue='is_spoiler', data=movie_reviews)
<Axes: xlabel='sum num tokens review_text plus plot summary', ylabel='Count'>
Using a Longformer, whose sequence length of 4096 tokens allows us to process both review_text and plot_summary in full: we do not see any row where the combined length of plot_summary and review_text goes beyond 4096.
movie_reviews[movie_reviews['is_spoiler'] == 1]
| | movie_id | plot_summary | duration | genre | rating_x | release_date | plot_synopsis | review_date | user_id | is_spoiler | review_text | rating_y | review_summary | review_id | _merge | review_summary_num_tokens | review_text_num_tokens | plot_synopsis_num_tokens | plot_summary_num_tokens | sum num tokens review_text plus plot summary |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | ['Action', 'Thriller'] | 6.9 | 1992-06-05 | jack ryan ford is on a working vacation in lon... | 5 March 2008 | ur16517420 | 1 | the second tom clancy novel made into a film '... | 6 | decent clancy thriller | bbc4b9ff-ea1b-455a-8b7e-32370d35fe17 | both | 3 | 431 | 1797 | 217 | 648 |
| 1 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | ['Action', 'Thriller'] | 6.9 | 1992-06-05 | jack ryan ford is on a working vacation in lon... | 15 October 2007 | ur13887584 | 1 | the second in what looks like becoming the 'ja... | 7 | jack ryan chapter ii | e04d4f23-89b5-4715-add8-1ee37d46dd99 | both | 4 | 129 | 1797 | 217 | 346 |
| 2 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | ['Action', 'Thriller'] | 6.9 | 1992-06-05 | jack ryan ford is on a working vacation in lon... | 16 September 2013 | ur2403537 | 1 | i was not a fan of the hunt for red october. i... | 8 | it's filled to the brim with excitement and it... | b28cbca4-dc30-421c-8e3a-dc2e3cba98c1 | both | 20 | 401 | 1797 | 217 | 618 |
| 3 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | ['Action', 'Thriller'] | 6.9 | 1992-06-05 | jack ryan ford is on a working vacation in lon... | 15 May 2007 | ur0819382 | 1 | jack ryan harrison ford is a cia analyst who s... | 7 | patriotic gore. | c657363f-224d-48db-a029-8e4ddd2b248d | both | 3 | 1045 | 1797 | 217 | 1262 |
| 4 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | ['Action', 'Thriller'] | 6.9 | 1992-06-05 | jack ryan ford is on a working vacation in lon... | 29 February 2004 | ur2567338 | 1 | this was one of the big summer movies of 1992.... | 7 | harrison ford in another tom clancy adaptation | 44c6a555-674d-4f1c-a9fa-88b35b82c8ba | both | 7 | 236 | 1797 | 217 | 453 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 538653 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | ['Action', 'Drama', 'History'] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 21 July 2017 | ur32188727 | 1 | this is a nolan film and like every other nola... | 7 | a visual feast and amazing film. | 05852db1-f196-4ee7-9ec8-18fbb7b5e963 | both | 7 | 589 | 1567 | 52 | 641 |
| 538654 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | ['Action', 'Drama', 'History'] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 24 July 2017 | ur30392044 | 1 | world war ii movies are a genre that until ver... | 9 | steals pure cinema back from the grave dug by ... | 038eeaa9-a1c1-4976-8ddd-2b13af0c158b | both | 11 | 823 | 1567 | 52 | 875 |
| 538655 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | ['Action', 'Drama', 'History'] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 24 July 2017 | ur68483081 | 1 | before we talk about the film let me tell you ... | 10 | dunkirk met a genuine dunkirk veteran at my sc... | ec4ac50a-8d4b-4076-9fa1-739a590f795d | both | 10 | 587 | 1567 | 52 | 639 |
| 538656 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | ['Action', 'Drama', 'History'] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 21 July 2017 | ur37909248 | 1 | from the beginning til the end i held my breat... | 10 | i had high expectation and the movie delivered. | ee91c48d-7f2e-4acc-be47-e3811b109c12 | both | 9 | 96 | 1567 | 52 | 148 |
| 538657 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | ['Action', 'Drama', 'History'] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 20 July 2017 | ur69070086 | 1 | one of the best war movies i have ever watched... | 10 | the battle of dunkirk | 7c71c712-e04b-47e3-a8cc-b4632574d368 | both | 4 | 94 | 1567 | 52 | 146 |
143055 rows × 20 columns
movie_reviews[(movie_reviews['sum num tokens review_text plus plot summary'] < 512) & (movie_reviews['is_spoiler'] == 1)]
| | movie_id | plot_summary | duration | genre | rating_x | release_date | plot_synopsis | review_date | user_id | is_spoiler | review_text | rating_y | review_summary | review_id | _merge | review_summary_num_tokens | review_text_num_tokens | plot_synopsis_num_tokens | plot_summary_num_tokens | sum num tokens review_text plus plot summary |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | ['Action', 'Thriller'] | 6.9 | 1992-06-05 | jack ryan ford is on a working vacation in lon... | 15 October 2007 | ur13887584 | 1 | the second in what looks like becoming the 'ja... | 7 | jack ryan chapter ii | e04d4f23-89b5-4715-add8-1ee37d46dd99 | both | 4 | 129 | 1797 | 217 | 346 |
| 4 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | ['Action', 'Thriller'] | 6.9 | 1992-06-05 | jack ryan ford is on a working vacation in lon... | 29 February 2004 | ur2567338 | 1 | this was one of the big summer movies of 1992.... | 7 | harrison ford in another tom clancy adaptation | 44c6a555-674d-4f1c-a9fa-88b35b82c8ba | both | 7 | 236 | 1797 | 217 | 453 |
| 5 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | ['Action', 'Thriller'] | 6.9 | 1992-06-05 | jack ryan ford is on a working vacation in lon... | 7 June 2002 | ur0002746 | 1 | this is a sorry excuse of a movie. i do not kn... | 4 | director fell asleep during movie. | 6f7c7135-e16c-4fc9-9f89-1a1bac6b20c3 | both | 6 | 270 | 1797 | 217 | 487 |
| 8 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | ['Action', 'Thriller'] | 6.9 | 1992-06-05 | jack ryan ford is on a working vacation in lon... | 7 August 2016 | ur45215448 | 1 | when cia analyst jack ryan interferes with an ... | 9 | the best of the ryan adaptations so far!!! | 4a4c44f2-dc3b-41ac-a663-a1156bf1f940 | both | 11 | 143 | 1797 | 217 | 360 |
| 11 | tt0105112 | former cia analyst, jack ryan is in england wi... | 1h 57min | ['Action', 'Thriller'] | 6.9 | 1992-06-05 | jack ryan ford is on a working vacation in lon... | 9 March 2008 | ur12816955 | 1 | harrison ford is dry and stiff as an actor in ... | 2 | what a bad film | e5b72fb2-b032-41e1-a3ee-0d6830a2b60f | both | 4 | 251 | 1797 | 217 | 468 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 538646 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | ['Action', 'Drama', 'History'] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 30 July 2017 | ur31231231 | 1 | this film was filmed amazingly, but that is wh... | 4 | a mishmash of a film, disappointing | 571b4560-5e67-4613-b56e-8ac90966adad | both | 9 | 328 | 1567 | 52 | 380 |
| 538647 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | ['Action', 'Drama', 'History'] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 27 July 2017 | ur0937743 | 1 | dunkirk 2017 12 fionn whitehead, aneurin barna... | 7 | shrugworthy, meh nolan's first big disappointi... | 8e015f1e-164e-49f9-afb0-8486e529fadd | both | 14 | 273 | 1567 | 52 | 325 |
| 538648 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | ['Action', 'Drama', 'History'] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 23 July 2017 | ur37359896 | 1 | a lot of good movies have come in war genre an... | 8 | people say this movie lack characters and stor... | 46982c0a-7318-4612-9559-124a8b5aad2c | both | 17 | 318 | 1567 | 52 | 370 |
| 538656 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | ['Action', 'Drama', 'History'] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 21 July 2017 | ur37909248 | 1 | from the beginning til the end i held my breat... | 10 | i had high expectation and the movie delivered. | ee91c48d-7f2e-4acc-be47-e3811b109c12 | both | 9 | 96 | 1567 | 52 | 148 |
| 538657 | tt5013056 | evacuation of allied soldiers from the british... | 1h 46min | ['Action', 'Drama', 'History'] | 8.1 | 2017-07-21 | the film alternates between three different pe... | 20 July 2017 | ur69070086 | 1 | one of the best war movies i have ever watched... | 10 | the battle of dunkirk | 7c71c712-e04b-47e3-a8cc-b4632574d368 | both | 4 | 94 | 1567 | 52 | 146 |
83879 rows × 20 columns
83879 / 143055
0.5863409178288072
About 58.6% of spoilers (83,879 of 143,055) have a combined review_text + plot_summary length of less than 512 tokens.
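The share above can be reproduced from per-row token counts. The following is a minimal sketch in plain Python, not the notebook's own code: the helper name `share_within_budget` and the short lists are illustrative assumptions, with three of the pairs mirroring rows visible in the table above.

```python
def share_within_budget(review_tokens, summary_tokens, budget=512):
    """Return the fraction of rows whose combined token count is below `budget`."""
    if len(review_tokens) != len(summary_tokens):
        raise ValueError("both lists must have one entry per review")
    combined = [r + s for r, s in zip(review_tokens, summary_tokens)]
    within = sum(1 for total in combined if total < budget)
    return within / len(combined)


# Illustrative token counts only (hypothetical sample, not the full dataset).
review_tokens = [143, 251, 600, 96]
summary_tokens = [217, 217, 52, 52]
print(share_within_budget(review_tokens, summary_tokens))  # 0.75

# The notebook's own figure: 83,879 of 143,055 spoiler reviews fit the budget.
print(round(83879 / 143055, 4))  # 0.5863
```

Since roughly two in five spoiler reviews exceed the budget, any model capped at 512 tokens will see truncated input for a large part of the positive class.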
movie_reviews.count()
movie_id 538828 plot_summary 538828 duration 538828 genre 538828 rating_x 538828 release_date 538828 plot_synopsis 538828 review_date 538828 user_id 538828 is_spoiler 538828 review_text 538828 rating_y 538828 review_summary 538729 review_id 538828 _merge 538828 review_summary_num_tokens 538828 review_text_num_tokens 538828 plot_synopsis_num_tokens 538828 plot_summary_num_tokens 538828 dtype: int64
movie_reviews.count()
movie_id 535083 plot_summary 535083 duration 535083 genre 535083 rating_x 535083 release_date 535083 plot_synopsis 535083 review_date 535083 user_id 535083 is_spoiler 535083 review_text 535083 rating_y 535083 review_summary 534986 review_id 535083 _merge 535083 review_summary_num_tokens 535083 review_text_num_tokens 535083 plot_synopsis_num_tokens 535083 plot_summary_num_tokens 535083 dtype: int64
movie_reviews['is_spoiler'].value_counts()
is_spoiler False 392028 True 143055 Name: count, dtype: int64
import numpy as np
from sklearn.model_selection import train_test_split
# Take an explicit copy so that adding a column does not raise SettingWithCopyWarning
X = movie_reviews[['review_text', 'plot_summary']].copy()
X['label'] = movie_reviews['is_spoiler'].astype(int)
Bidirectional LSTM¶
We train a bidirectional LSTM on review_text alone, using a balanced sample of 10,000 positive and 10,000 negative examples.
import numpy as np
from sklearn.model_selection import train_test_split
# Take an explicit copy so that adding a column does not raise SettingWithCopyWarning
X = movie_reviews[['review_text']].copy()
X['label'] = movie_reviews['is_spoiler'].astype(int)
# Split the data by class before balancing
positive_x = X[X['label'] == 1]
negative_x = X[X['label'] == 0]
# Class sizes before sampling
num_negative = len(negative_x)
print(num_negative)
print(len(positive_x))
392028 143055
# Downsample each class to 10,000 rows (sampling without replacement)
positive_x = positive_x.sample(n=10000, replace=False, random_state=42)
negative_x = negative_x.sample(n=10000, replace=False, random_state=42)
# Combine the sampled negative and positive rows
balanced_x = pd.concat([negative_x, positive_x])
# Shuffle the combined dataframe
balanced_x = balanced_x.sample(frac=1, random_state=42).reset_index(drop=True)
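The balancing step above draws the same number of rows from each class without replacement and then shuffles. As a dependency-free sketch of the same idea (the function `balanced_sample` and the toy rows are assumptions for illustration, and the random draws will not match pandas' `sample`):

```python
import random


def balanced_sample(rows, label_key, n_per_class, seed=42):
    """Draw n_per_class rows from each label without replacement, then shuffle."""
    rng = random.Random(seed)
    by_label = {}
    for row in rows:
        by_label.setdefault(row[label_key], []).append(row)
    sampled = []
    for label, group in sorted(by_label.items()):
        if len(group) < n_per_class:
            raise ValueError(f"label {label!r} has only {len(group)} rows")
        sampled.extend(rng.sample(group, n_per_class))
    rng.shuffle(sampled)
    return sampled


# Toy imbalanced data: 30 non-spoilers and 10 spoilers (hypothetical rows).
rows = [{"review_text": f"review {i}", "label": int(i >= 30)} for i in range(40)]
balanced = balanced_sample(rows, "label", n_per_class=5)
counts = {label: sum(r["label"] == label for r in balanced) for label in (0, 1)}
print(counts)  # {0: 5, 1: 5}
```

Downsampling both classes keeps training cheap, at the cost of discarding most of the 392,028 negative and 143,055 positive reviews.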
!pip install nltk
!pip install spacy
!python -m spacy download en_core_web_sm
import nltk
from nltk.corpus import wordnet
from nltk.tokenize import word_tokenize
from nltk.stem import WordNetLemmatizer
# Download necessary NLTK data
nltk.download('punkt')
nltk.download('wordnet')
nltk.download('omw-1.4')
nltk.download('averaged_perceptron_tagger') # Download the missing tagger model
# Function to convert nltk POS tags to wordnet POS tags
def get_wordnet_pos(word):
    tag = nltk.pos_tag([word])[0][1][0].upper()
    tag_dict = {"J": wordnet.ADJ,
                "N": wordnet.NOUN,
                "V": wordnet.VERB,
                "R": wordnet.ADV}
    return tag_dict.get(tag, wordnet.NOUN)
# Lemmatize function
def lemmatize_text(text):
    lemmatizer = WordNetLemmatizer()
    words = word_tokenize(text)
    lemmatized_words = [lemmatizer.lemmatize(word, get_wordnet_pos(word)) for word in words]
    return ' '.join(lemmatized_words)
balanced_x['review_text'] = balanced_x['review_text'].map(lambda x: lemmatize_text(x))
import numpy as np
import pandas as pd
from sklearn.metrics import confusion_matrix, classification_report
import seaborn as sns
import matplotlib.pyplot as plt
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dense
from gensim.models import Word2Vec
import nltk
from nltk.corpus import wordnet
from nltk.tokenize import word_tokenize
from nltk.stem import WordNetLemmatizer
import spacy
# Assuming you have lemmatized_texts from previous step
# Tokenize and pad sequences as before
tokenizer = Tokenizer()
tokenizer.fit_on_texts(balanced_x['review_text'])
sequences = tokenizer.texts_to_sequences(balanced_x['review_text'])
word_index = tokenizer.word_index
# Pad sequences
max_length = max(len(seq) for seq in sequences)
data = pad_sequences(sequences, maxlen=max_length)
# Split data into training and test sets
X_train_temp, X_test, y_train_temp, y_test = train_test_split(data, balanced_x['label'], test_size=0.1, random_state=42, stratify=balanced_x['label'])
# Further split training data into training and validation sets
X_train, X_val, y_train, y_val = train_test_split(X_train_temp, y_train_temp, test_size=0.1, random_state=42, stratify=y_train_temp)
# As before, using lemmatized sentences
sentences = [text.split() for text in balanced_x['review_text']]
word2vec_model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, workers=4)
# Create embedding matrix
embedding_dim = 100 # This should match the vector_size of Word2Vec model
embedding_matrix = np.zeros((len(word_index) + 1, embedding_dim))
for word, i in word_index.items():
    if word in word2vec_model.wv:
        embedding_matrix[i] = word2vec_model.wv[word]
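Returning to the two successive `train_test_split` calls above: holding out 10% for testing and then 10% of the remainder for validation leaves 81% of the sample for training. The sketch below works through that arithmetic for the 20,000-row balanced sample and a batch size of 32. The helper `split_sizes` is an assumption for illustration, and its integer arithmetic presumes the 10% fractions come out as whole numbers, which holds here.

```python
import math


def split_sizes(n_rows, test_pct=10, val_pct=10, batch_size=32):
    """Row counts and batches per pass after two successive hold-out splits."""
    n_test = n_rows * test_pct // 100
    n_remaining = n_rows - n_test
    n_val = n_remaining * val_pct // 100
    n_train = n_remaining - n_val

    def batches(n):
        # The last batch may be partial, hence the ceiling.
        return math.ceil(n / batch_size)

    return {
        "train": (n_train, batches(n_train)),
        "val": (n_val, batches(n_val)),
        "test": (n_test, batches(n_test)),
    }


sizes = split_sizes(20000)
print(sizes)  # {'train': (16200, 507), 'val': (1800, 57), 'test': (2000, 63)}
```

The batch counts 507, 57 and 63 are the same figures that appear in the progress counters of the training, calibration and test outputs.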
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dense
from tensorflow.keras.callbacks import EarlyStopping
model = Sequential()
model.add(Embedding(input_dim=len(word_index) + 1,
output_dim=embedding_dim,
weights=[embedding_matrix],
input_length=max_length,
trainable=False))
model.add(Bidirectional(LSTM(64)))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# Early stopping callback
early_stopping = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)
# Train the model
history = model.fit(X_train, y_train, epochs=100, batch_size=32, validation_data=(X_val, y_val), callbacks=[early_stopping])
Epoch 1/100
507/507 [==============================] - 53s 97ms/step - loss: 0.6934 - accuracy: 0.5193 - val_loss: 0.6898 - val_accuracy: 0.5278
Epoch 2/100
507/507 [==============================] - 49s 96ms/step - loss: 0.6642 - accuracy: 0.6007 - val_loss: 0.6567 - val_accuracy: 0.6300
Epoch 3/100
507/507 [==============================] - 49s 96ms/step - loss: 0.6240 - accuracy: 0.6570 - val_loss: 0.6363 - val_accuracy: 0.6372
Epoch 4/100
507/507 [==============================] - 49s 96ms/step - loss: 0.5978 - accuracy: 0.6834 - val_loss: 0.6281 - val_accuracy: 0.6378
Epoch 5/100
507/507 [==============================] - 49s 96ms/step - loss: 0.5755 - accuracy: 0.7009 - val_loss: 0.6268 - val_accuracy: 0.6544
Epoch 6/100
507/507 [==============================] - 49s 96ms/step - loss: 0.5507 - accuracy: 0.7194 - val_loss: 0.6283 - val_accuracy: 0.6528
Epoch 7/100
507/507 [==============================] - 49s 97ms/step - loss: 0.5245 - accuracy: 0.7417 - val_loss: 0.6355 - val_accuracy: 0.6567
Epoch 8/100
507/507 [==============================] - 49s 96ms/step - loss: 0.4886 - accuracy: 0.7656 - val_loss: 0.6512 - val_accuracy: 0.6439
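The log above ends after 8 of the 100 requested epochs because of the early stopping callback (patience of 3 on val_loss). The sketch below is a simplified re-implementation of that patience rule, not Keras's own code, and it assumes a minimum improvement threshold of zero. It is applied to the val_loss values from the log.

```python
def early_stopping_trace(val_losses, patience=3):
    """Return (best_epoch, stop_epoch), both 1-indexed, for a patience-based rule."""
    best_loss = float("inf")
    best_epoch = 0
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # New best validation loss: remember it and reset the counter.
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return best_epoch, epoch
    return best_epoch, len(val_losses)


# val_loss per epoch, copied from the training log above.
val_losses = [0.6898, 0.6567, 0.6363, 0.6281, 0.6268, 0.6283, 0.6355, 0.6512]
best_epoch, stop_epoch = early_stopping_trace(val_losses, patience=3)
print(best_epoch, stop_epoch)  # 5 8
```

With restore_best_weights=True, the model used afterwards carries the epoch 5 weights (val_loss 0.6268, val_accuracy 0.6544), not those of the final epoch.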
plt.figure(figsize=(10, 5))
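# Take the epoch axis from the history so the plot adapts to where early stopping ended
epochs_run = range(1, len(history.history['loss']) + 1)
plt.plot(epochs_run, history.history['loss'], label='Train Loss')
plt.plot(epochs_run, history.history['val_loss'], label='Val Loss')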
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training and Validation Losses')
plt.legend()
plt.grid(True)
plt.show()
Evaluating BiLSTM and confusion matrix¶
# Make predictions
y_pred_probs = model.predict(X_test)
y_pred = (y_pred_probs > 0.5).astype("int32")
# Generate confusion matrix
cm = confusion_matrix(y_test, y_pred)
# Visualize confusion matrix
plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', xticklabels=['Negative', 'Positive'], yticklabels=['Negative', 'Positive'])
plt.xlabel('Predicted')
plt.ylabel('True')
plt.title('Confusion Matrix')
plt.show()
# Print classification report
print(classification_report(y_test, y_pred, target_names=['Negative', 'Positive']))
63/63 [==============================] - 3s 43ms/step
              precision    recall  f1-score   support

    Negative       0.66      0.60      0.63      1000
    Positive       0.64      0.69      0.66      1000

    accuracy                           0.65      2000
   macro avg       0.65      0.65      0.65      2000
weighted avg       0.65      0.65      0.65      2000
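Each figure in the report above is derived from the four cells of the confusion matrix. The sketch below shows the formulas. The counts are hypothetical: they were chosen to be consistent with the rounded report, since the exact cell values appear only in the heatmap and are not reproduced in the text output.

```python
def binary_report(tn, fp, fn, tp):
    """Per-class precision, recall, F1 and overall accuracy from a 2x2 confusion matrix."""
    def f1(precision, recall):
        return 2 * precision * recall / (precision + recall)

    precision_pos = tp / (tp + fp)
    recall_pos = tp / (tp + fn)
    precision_neg = tn / (tn + fn)
    recall_neg = tn / (tn + fp)
    return {
        "Negative": (precision_neg, recall_neg, f1(precision_neg, recall_neg)),
        "Positive": (precision_pos, recall_pos, f1(precision_pos, recall_pos)),
        "accuracy": (tp + tn) / (tn + fp + fn + tp),
    }


# Hypothetical counts for 1000 negative and 1000 positive test reviews.
report = binary_report(tn=604, fp=396, fn=306, tp=694)
for name in ("Negative", "Positive"):
    print(name, [round(value, 2) for value in report[name]])
print("accuracy", round(report["accuracy"], 2))
```

Under these assumed counts the model recovers more spoilers than non-spoilers (recall 0.69 against 0.60), which matches the asymmetry in the report.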
Calibration plot of BiLSTM¶
# Platt Scaling
from sklearn.linear_model import LogisticRegression
from sklearn.calibration import CalibratedClassifierCV
import numpy as np
# Assuming `model` is your trained BiLSTM model and `X_val` is your validation data
# Get predicted probabilities from BiLSTM
probs = model.predict(X_val)
# Flatten the probabilities (assuming binary classification)
probs_flat = np.reshape(probs, (-1,))
# Fit a logistic regression model for calibration
calibrator = CalibratedClassifierCV(LogisticRegression(), method='sigmoid')
calibrator.fit(probs_flat.reshape(-1, 1), y_val)
# Calibrate probabilities
calibrated_probs = calibrator.predict_proba(probs_flat.reshape(-1, 1))[:, 1]
57/57 [==============================] - 3s 43ms/step
probs = np.hstack((1 - calibrated_probs.reshape(-1, 1), calibrated_probs.reshape(-1, 1)))
import matplotlib.pyplot as plt
from sklearn.calibration import calibration_curve
# Plot calibration curve
prob_true, prob_pred = calibration_curve(y_val, probs[:, 1], n_bins=10)
plt.figure(figsize=(8, 8))
plt.plot(prob_pred, prob_true, marker='o', linestyle='--', color='b', label='BiLSTM')
plt.plot([0, 1], [0, 1], linestyle='--', color='k', label='Perfectly calibrated')
plt.xlabel('Predicted probability')
plt.ylabel('True probability')
plt.title('Calibration plot (Reliability Curve)')
plt.legend()
plt.show()
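The calibration curve of the Platt-scaled probabilities follows the 45-degree line closely, which suggests the scaled outputs are well calibrated on the validation set. The calibrator was fitted on these same validation predictions, so the plot is not a held-out check.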
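A reliability curve of this kind groups predictions into probability bins and compares the mean predicted probability in each bin with the observed fraction of positives. The sketch below is a simplified plain-Python version with equal-width bins, similar in spirit to scikit-learn's `calibration_curve` but not a copy of it. The function name and the toy data are assumptions for illustration.

```python
def reliability_curve(y_true, y_prob, n_bins=10):
    """For each non-empty bin: (mean predicted probability, observed positive fraction)."""
    prob_sums = [0.0] * n_bins
    positives = [0] * n_bins
    counts = [0] * n_bins
    for label, prob in zip(y_true, y_prob):
        index = min(int(prob * n_bins), n_bins - 1)  # keep prob == 1.0 in the last bin
        prob_sums[index] += prob
        positives[index] += label
        counts[index] += 1
    return [
        (prob_sums[i] / counts[i], positives[i] / counts[i])
        for i in range(n_bins)
        if counts[i] > 0
    ]


# Toy predictions (hypothetical): one positive in five at 0.2, four in five at 0.8.
y_prob = [0.2] * 5 + [0.8] * 5
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
curve = reliability_curve(y_true, y_prob, n_bins=2)
largest_gap = max(abs(predicted - observed) for predicted, observed in curve)
print(curve)
print(largest_gap)
```

Points on the diagonal mean that, among reviews given a probability near p, roughly a fraction p are actually spoilers. The largest gap across bins is one simple way to summarise how far a curve strays from that line.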
Transformer based model¶
Here, we use pre-trained transformer-based models. We sample an equal number of positive and negative examples from the full dataset for these experiments, as using the full dataset was not practical given the resources available.
import numpy as np
from sklearn.model_selection import train_test_split
# Take an explicit copy so that adding a column does not raise SettingWithCopyWarning
X = movie_reviews[['review_text', 'plot_summary']].copy()
X['label'] = movie_reviews['is_spoiler'].astype(int)
# Split the data by class before balancing
positive_x = X[X['label'] == 1]
negative_x = X[X['label'] == 0]
# Downsample each class to 5,000 rows (sampling without replacement)
positive_x = positive_x.sample(n=5000, replace=False, random_state=42)
negative_x = negative_x.sample(n=5000, replace=False, random_state=42)
# Combine the sampled negative and positive rows
balanced_x = pd.concat([negative_x, positive_x])
# Shuffle the combined dataframe
balanced_x = balanced_x.sample(frac=1, random_state=42).reset_index(drop=True)
Using pre-trained bert-base-uncased BertForSequenceClassification model¶
max_length = 512
!pip install transformers torch peft
import json
import os
from transformers import AutoTokenizer, AutoModelForSequenceClassification, AutoConfig
# Modify the configuration to set max_position_embeddings if needed
config = AutoConfig.from_pretrained('bert-base-uncased',
                                    num_labels=2,
                                    max_position_embeddings=max_length
                                    )
# Save the modified configuration to a JSON file
config_dir = prefix_path + '/config'
os.makedirs(config_dir, exist_ok=True)
config_path = os.path.join(config_dir, 'config.json')
with open(config_path, 'w') as f:
    json.dump(config.to_dict(), f)
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased', config=config)
# Load pre-trained model with custom configuration
model = AutoModelForSequenceClassification.from_pretrained(
    'bert-base-uncased',
    config=config,  # Pass the configuration object built above
    cache_dir=prefix_path + '/cache',  # Cache directory
    force_download=False,  # Reuse cached weights instead of forcing a new download
    proxies=None,  # No proxy servers
    output_loading_info=False,  # Do not return loading info
    local_files_only=False,  # Allow fetching from the Hub, not only local files
    revision='main',  # Specific model version
    # gradient_checkpointing=True,
    ignore_mismatched_sizes=True
)
from peft import LoraConfig, get_peft_model
# Define LoRA configuration
lora_config = LoraConfig(
r=8, # The rank of the LoRA matrix
lora_alpha=32, # Scaling factor
target_modules=["query", "value"], # Target modules for LoRA
lora_dropout=0.1, # Dropout rate
bias="none" # Bias handling
)
# Apply LoRA to the model
model = get_peft_model(model, lora_config)
#model = BertForSequenceClassification.from_pretrained('bert-base-uncased', config=config, ignore_mismatched_sizes=True)
# Enable gradient checkpointing
#model.gradient_checkpointing_enable()
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
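As a rough check on what the LoRA settings above imply, the following sketch counts the adapter parameters they would add to bert-base-uncased. It assumes the usual BERT-base shape (12 encoder layers, hidden size 768) and that `target_modules=["query", "value"]` matches exactly one query and one value projection per layer. The helper `lora_param_count` is a hypothetical name for illustration and is not part of PEFT.

```python
def lora_param_count(num_layers, hidden_size, rank, modules_per_layer):
    """Count parameters in the low-rank A (rank x in) and B (out x rank) matrices."""
    per_module = rank * hidden_size + hidden_size * rank
    return num_layers * modules_per_layer * per_module

# Assumed BERT-base dimensions: 12 encoder layers, hidden size 768.
added = lora_param_count(num_layers=12, hidden_size=768, rank=8, modules_per_layer=2)
# lora_alpha / r is the factor applied to the low-rank update.
scaling = 32 / 8
print(added, scaling)
```

Under these assumptions the adapters add about 0.3M trainable parameters, a small fraction of the roughly 110M parameters in the base model.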
from sklearn.model_selection import train_test_split
# Split data into train, validation, and test sets
train_df, temp_df = train_test_split(balanced_x, test_size=0.2, random_state=42, stratify=balanced_x['label'])
eval_df, test_df = train_test_split(temp_df, test_size=0.5, random_state=42, stratify=temp_df['label'])
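The two `train_test_split` calls above amount to an 80/10/10 split. The sketch below works through that arithmetic, assuming the balanced set holds 10,000 rows (5,000 per class, as stated in the summary). The function `split_sizes` is illustrative only, and scikit-learn's own rounding may differ by one row for sizes that do not divide evenly.

```python
def split_sizes(n_samples, first_test_size=0.2, second_test_size=0.5):
    """Approximate train/eval/test sizes for a two-stage hold-out split."""
    holdout = round(n_samples * first_test_size)   # rows set aside by the first call
    train = n_samples - holdout
    test = round(holdout * second_test_size)       # half of the hold-out becomes test
    eval_ = holdout - test
    return train, eval_, test

print(split_sizes(10_000))
```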
import pandas as pd
import torch
from torch.utils.data import DataLoader, Dataset
from sklearn.model_selection import train_test_split
from transformers import RobertaTokenizer, RobertaForSequenceClassification, RobertaConfig, Trainer, TrainingArguments, default_data_collator
from sklearn.metrics import accuracy_score
class SentencePairDataset(Dataset):
    def __init__(self, dataframe, tokenizer, max_length):
        self.dataframe = dataframe
        self.tokenizer = tokenizer
        self.max_length = max_length
    def __len__(self):
        return len(self.dataframe)
    def __getitem__(self, idx):
        # Columns are read by position: first text, second text, label
        sentence1 = self.dataframe.iloc[idx, 0]
        sentence2 = self.dataframe.iloc[idx, 1]
        label = self.dataframe.iloc[idx, 2]
        # Encode the two texts as one pair, truncated and padded to max_length
        # (the deprecated pad_to_max_length flag is redundant with padding="max_length")
        inputs = self.tokenizer.encode_plus(
            sentence1,
            sentence2,
            add_special_tokens=True,
            max_length=self.max_length,
            truncation=True,
            padding="max_length",
            return_tensors="pt",
            return_attention_mask=True,
            return_token_type_ids=True,
            return_overflowing_tokens=False,
            return_length=True
        )
        inputs["labels"] = torch.tensor(label, dtype=torch.long)
        # Drop the batch dimension added by return_tensors="pt"
        return {key: val.squeeze(0) for key, val in inputs.items()}
# Create dataset objects
train_dataset = SentencePairDataset(train_df, tokenizer, max_length)
eval_dataset = SentencePairDataset(eval_df, tokenizer, max_length)
test_dataset = SentencePairDataset(test_df, tokenizer, max_length)
train_dataset.__getitem__(0)
{'input_ids': tensor([ 101, 2748, 2023, 2003, 1037, 10225, 25318, 3185, 10303, 2045,
2003, 2070, 11749, 5171, 10225, 25318, 13764, 8649, 1998, 1037,
8552, 1998, 13749, 18595, 3560, 2466, 1012, 2021, 2242, 2003,
4394, 2182, 1010, 2926, 4102, 2000, 10225, 25318, 1005, 1055,
2010, 2060, 2147, 1012, 2364, 3291, 2003, 1045, 2228, 1996,
3494, 1012, 2027, 2024, 2204, 2035, 1998, 18080, 1999, 1037,
2126, 2021, 2027, 2024, 2025, 2428, 13432, 1012, 2926, 1996,
2364, 2839, 9901, 2829, 14089, 24665, 3771, 2003, 15640, 1012,
2016, 2003, 2025, 2428, 2844, 2438, 2004, 1037, 2839, 2000,
4287, 2023, 2972, 3185, 1012, 2613, 2364, 2839, 1997, 1996,
3185, 2000, 2033, 2001, 5212, 1048, 1012, 4027, 2021, 2339,
2821, 2339, 2106, 2017, 2018, 2000, 4929, 2008, 5976, 2559,
18865, 3520, 1029, 2839, 2008, 1045, 5621, 4669, 2001, 2728,
2139, 9152, 3217, 1005, 1055, 1012, 2002, 3248, 1037, 4066,
1997, 10916, 2535, 2066, 2017, 2052, 2025, 5987, 2013, 2032,
1999, 1037, 3185, 2066, 2023, 1012, 1996, 2717, 1997, 1996,
2839, 2024, 2025, 2428, 2844, 2030, 5875, 2438, 2000, 4287,
2023, 2972, 3185, 1012, 2428, 10473, 5783, 1997, 1996, 3185,
2003, 1996, 8235, 2135, 2517, 5896, 2008, 2320, 2153, 2003,
3561, 2007, 2070, 5171, 10225, 25318, 3787, 1998, 21438, 1012,
2748, 1010, 1996, 4566, 2453, 2018, 2042, 1037, 2978, 5410,
1998, 3383, 15640, 2021, 2053, 2009, 2106, 2025, 9868, 1996,
3185, 2005, 2033, 1012, 1045, 2145, 5632, 2023, 3185, 1010,
2009, 2074, 2347, 1005, 1056, 2004, 2204, 2004, 1045, 2018,
5113, 2005, 1012, 2009, 2001, 2785, 1997, 2066, 1996, 3185,
2041, 1997, 4356, 2011, 7112, 2061, 4063, 4059, 2232, 2029,
2038, 1037, 2785, 1997, 2714, 3857, 22264, 27983, 1998, 2466,
1010, 2009, 3310, 2200, 2485, 2012, 2335, 2021, 2009, 2196,
3243, 3138, 1996, 9850, 1012, 2145, 1037, 6669, 3422, 3085,
1010, 3262, 4569, 3185, 2021, 2145, 2017, 2024, 2488, 1997,
3666, 10225, 25318, 1005, 1055, 2060, 2147, 1012, 27671, 102,
1996, 2690, 18655, 17946, 7971, 9901, 2829, 20673, 17125, 2769,
2013, 3290, 2000, 3050, 3349, 2005, 1996, 2608, 11033, 2030,
12662, 12289, 1012, 2043, 2016, 4152, 3236, 2011, 1996, 6074,
4097, 9851, 2102, 1998, 2928, 18243, 12349, 2007, 2702, 4595,
6363, 1998, 16034, 1999, 2014, 8722, 1010, 2027, 16599, 1037,
3066, 2000, 2014, 2000, 2393, 2068, 2000, 6545, 2030, 12662,
1999, 3863, 1997, 2014, 4071, 1012, 5564, 2030, 12662, 5176,
1996, 5179, 29100, 11614, 4098, 9115, 1010, 2040, 3216, 1037,
15358, 5416, 2449, 1010, 2000, 2713, 9901, 2829, 2007, 1996,
6808, 1997, 15349, 2014, 1012, 9901, 13172, 1997, 2030, 12662,
1005, 1055, 6808, 1998, 14811, 1037, 8552, 7023, 2208, 2007,
4098, 2000, 8954, 2431, 1037, 2454, 6363, 2013, 2030, 12662,
1012, 2517, 2011, 19569, 2482, 10175, 6806, 1010, 5673, 2139,
11497, 1010, 4380, 102, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0]),
'token_type_ids': tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0]),
'attention_mask': tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0]),
'length': tensor(512),
'labels': tensor(0)}
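The output above shows how the BERT tokenizer lays out a pair: id 101 opens the sequence, id 102 closes each segment, `token_type_ids` switches to 1 for the second text, and id 0 pads the remainder. The sketch below reproduces that layout on a few token ids without calling the tokenizer. The function `layout_pair` is a hypothetical helper, and it skips truncation, which the real tokenizer performs when the pair exceeds `max_length`.

```python
CLS_ID, SEP_ID, PAD_ID = 101, 102, 0  # ids visible in the BERT output above

def layout_pair(ids_a, ids_b, max_length):
    """Lay out [CLS] A [SEP] B [SEP] and pad to max_length (no truncation here)."""
    input_ids = [CLS_ID] + ids_a + [SEP_ID] + ids_b + [SEP_ID]
    if len(input_ids) > max_length:
        raise ValueError("pair too long; the real tokenizer would truncate")
    # Segment 0 covers [CLS] A [SEP]; segment 1 covers B [SEP].
    token_type_ids = [0] * (len(ids_a) + 2) + [1] * (len(ids_b) + 1)
    attention_mask = [1] * len(input_ids)
    pad = max_length - len(input_ids)
    return {
        "input_ids": input_ids + [PAD_ID] * pad,
        "token_type_ids": token_type_ids + [0] * pad,
        "attention_mask": attention_mask + [0] * pad,
    }

example = layout_pair([2748, 2023], [1996, 2690], max_length=10)
print(example)
```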
# DataLoader
train_loader = DataLoader(train_dataset, batch_size=1, shuffle=True, collate_fn=default_data_collator)
val_loader = DataLoader(eval_dataset, batch_size=1, collate_fn=default_data_collator)
test_loader = DataLoader(test_dataset, batch_size=1, collate_fn=default_data_collator)
import torch
# Check if CUDA is available
if torch.cuda.is_available():
    device_id = torch.cuda.current_device() # Get the current device ID
    torch.cuda.empty_cache()
    torch.cuda.reset_peak_memory_stats(device_id) # Pass the device ID
    print(torch.cuda.memory_summary(device=None, abbreviated=False))
else:
    print("CUDA is not available. Please check your installation.")
|===========================================================================| | PyTorch CUDA memory summary, device ID 0 | |---------------------------------------------------------------------------| | CUDA OOMs: 0 | cudaMalloc retries: 0 | |===========================================================================| | Metric | Cur Usage | Peak Usage | Tot Alloc | Tot Freed | |---------------------------------------------------------------------------| | Allocated memory | 0 B | 0 B | 0 B | 0 B | | from large pool | 0 B | 0 B | 0 B | 0 B | | from small pool | 0 B | 0 B | 0 B | 0 B | |---------------------------------------------------------------------------| | Active memory | 0 B | 0 B | 0 B | 0 B | | from large pool | 0 B | 0 B | 0 B | 0 B | | from small pool | 0 B | 0 B | 0 B | 0 B | |---------------------------------------------------------------------------| | Requested memory | 0 B | 0 B | 0 B | 0 B | | from large pool | 0 B | 0 B | 0 B | 0 B | | from small pool | 0 B | 0 B | 0 B | 0 B | |---------------------------------------------------------------------------| | GPU reserved memory | 0 B | 0 B | 0 B | 0 B | | from large pool | 0 B | 0 B | 0 B | 0 B | | from small pool | 0 B | 0 B | 0 B | 0 B | |---------------------------------------------------------------------------| | Non-releasable memory | 0 B | 0 B | 0 B | 0 B | | from large pool | 0 B | 0 B | 0 B | 0 B | | from small pool | 0 B | 0 B | 0 B | 0 B | |---------------------------------------------------------------------------| | Allocations | 0 | 0 | 0 | 0 | | from large pool | 0 | 0 | 0 | 0 | | from small pool | 0 | 0 | 0 | 0 | |---------------------------------------------------------------------------| | Active allocs | 0 | 0 | 0 | 0 | | from large pool | 0 | 0 | 0 | 0 | | from small pool | 0 | 0 | 0 | 0 | |---------------------------------------------------------------------------| | GPU reserved segments | 0 | 0 | 0 | 0 | | from large pool | 0 | 0 | 0 | 0 | | from small pool | 0 | 0 | 0 | 0 | 
|---------------------------------------------------------------------------| | Non-releasable allocs | 0 | 0 | 0 | 0 | | from large pool | 0 | 0 | 0 | 0 | | from small pool | 0 | 0 | 0 | 0 | |---------------------------------------------------------------------------| | Oversize allocations | 0 | 0 | 0 | 0 | |---------------------------------------------------------------------------| | Oversize GPU segments | 0 | 0 | 0 | 0 | |===========================================================================|
from transformers import EarlyStoppingCallback
# Early stopping callback
early_stopping = EarlyStoppingCallback(early_stopping_patience=2, early_stopping_threshold=0.0,)
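To make the effect of `early_stopping_patience=2` and `early_stopping_threshold=0.0` concrete, here is a simplified sketch of the patience rule applied to a list of validation losses. It is not the library's implementation: `first_stop_index` is a hypothetical helper, the loss values are made up for illustration, and the real callback relies on the best metric tracked by the Trainer.

```python
def first_stop_index(eval_losses, patience=2, threshold=0.0):
    """Return the index of the evaluation at which training would stop, or None."""
    best = None
    bad_evals = 0
    for i, loss in enumerate(eval_losses):
        if best is None or best - loss > threshold:
            # The loss improved by more than the threshold: reset the counter.
            best = loss
            bad_evals = 0
        else:
            bad_evals += 1
            if bad_evals >= patience:
                return i
    return None

# Made-up losses: two evaluations in a row fail to beat 0.60.
stalled = first_stop_index([0.65, 0.60, 0.61, 0.62, 0.59])
# Made-up losses that keep improving never trigger the stop.
improving = first_stop_index([0.65, 0.60, 0.58, 0.57])
print(stalled, improving)
```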
import numpy as np
from sklearn.metrics import accuracy_score
def compute_metrics(p):
    # p.predictions holds the logits; pick the highest-scoring class per example
    predictions = np.argmax(p.predictions, axis=1)
    # accuracy_score replaces datasets.load_metric, which newer releases of datasets no longer provide
    return {"accuracy": accuracy_score(p.label_ids, predictions)}
# Training arguments
training_args = TrainingArguments(
output_dir= prefix_path + '/results',
num_train_epochs=10,
per_device_train_batch_size=1,
per_device_eval_batch_size=1,
warmup_steps=500,
weight_decay=0.01,
logging_dir= prefix_path + '/logs',
logging_steps=10,
evaluation_strategy="epoch",
save_strategy="epoch",
load_best_model_at_end=True,
metric_for_best_model='eval_loss',
greater_is_better=False,
gradient_accumulation_steps=12,
fp16=True,
label_names=["labels"]
)
/usr/local/lib/python3.10/dist-packages/transformers/training_args.py:1494: FutureWarning: `evaluation_strategy` is deprecated and will be removed in version 4.46 of 🤗 Transformers. Use `eval_strategy` instead warnings.warn(
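The training arguments above combine a per-device batch size of 1 with 12 gradient-accumulation steps. The sketch below works out what that means for the optimizer schedule, assuming a single device and 8,000 training rows (80% of the 10,000 balanced examples mentioned in the summary). The helper `schedule_summary` is illustrative, and the Trainer's own step accounting may differ slightly at epoch boundaries.

```python
def schedule_summary(n_train, per_device_batch, grad_accum, epochs, warmup_steps):
    """Approximate optimizer-step counts for a single-device run."""
    effective_batch = per_device_batch * grad_accum
    # A trailing partial accumulation group is ignored in this sketch.
    steps_per_epoch = n_train // effective_batch
    total_steps = steps_per_epoch * epochs
    return {
        "effective_batch": effective_batch,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "warmup_fraction": warmup_steps / total_steps,
    }

summary = schedule_summary(n_train=8000, per_device_batch=1, grad_accum=12,
                           epochs=10, warmup_steps=500)
print(summary)
```

Under these assumptions the 500 warmup steps cover about 7.5% of a full 10-epoch run, and `logging_steps=10` records the training loss once every 120 examples.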
# Custom Trainer class
class CustomTrainer(Trainer):
    def __init__(
        self,
        model,
        args,
        train_dataset=None,
        eval_dataset=None,
        data_collator=None,
        compute_metrics=None,
        tokenizer=None, # Pass the tokenizer here
        callbacks=None
    ):
        # Forward the constructor arguments instead of relying on notebook globals
        super().__init__(
            model=model,
            args=args,
            train_dataset=train_dataset,
            eval_dataset=eval_dataset,
            data_collator=data_collator,
            compute_metrics=compute_metrics,
            tokenizer=tokenizer,
            callbacks=callbacks
        )
    def save_model(self, output_dir=None, _internal_call=False):
        # Only save the model weights, not the optimizer state
        if output_dir is None:
            output_dir = self.args.output_dir
        self.model.save_pretrained(output_dir)
        self._rotate_checkpoints(use_mtime=True)
# Trainer
trainer = CustomTrainer(
model=model,
args=training_args,
train_dataset=train_dataset,
eval_dataset=eval_dataset,
data_collator=default_data_collator,
compute_metrics=compute_metrics,
tokenizer=tokenizer, # Pass the tokenizer here
callbacks=[early_stopping]
)
# Train the model
trainer.train()
# Save the model, tokenizer, and trainer state
model.save_pretrained(prefix_path + '/results')
#tokenizer.save_pretrained(prefix_path + '/results')
#trainer.save_model(prefix_path + '/results') # Save the trained model
trainer.save_state()
# Save the model and tokenizer
model_save_path = prefix_path + '/results' + "/model"
tokenizer_save_path = prefix_path + '/results' + "/tokenizer"
trainer.save_model(model_save_path)
tokenizer.save_pretrained(tokenizer_save_path)
# Save the trainer state
trainer.state.save_to_json( prefix_path + '/results' + "/trainer_state.json")
| Epoch | Training Loss | Validation Loss | Accuracy |
|---|---|---|---|
| 0 | 0.676600 | 0.645086 | 0.635000 |
| 1 | 0.569500 | 0.608502 | 0.661000 |
| 3 | 0.580300 | 0.580992 | 0.692000 |
| 4 | 0.561200 | 0.574466 | 0.698000 |
| 6 | 0.580600 | 0.571313 | 0.699000 |
Evaluation on eval dataset and plot of training and validation loss over epochs¶
import matplotlib.pyplot as plt
# Initialize lists to hold the extracted values
val_losses = []
training_losses = []
epochs = []
# Track the last training loss before each eval_loss entry
last_training_loss = None
for entry in trainer.state.log_history:
    # Update the last seen training loss
    if 'loss' in entry:
        last_training_loss = entry['loss']
    # Extract eval_loss and corresponding epoch
    if 'eval_loss' in entry:
        val_losses.append(entry['eval_loss'])
        training_losses.append(last_training_loss)
        epochs.append(entry['epoch'])
# Plot training and validation loss
plt.plot(epochs, training_losses, label='Training Loss')
plt.plot(epochs, val_losses, label='Validation Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.title('Training and Validation Loss')
plt.legend()
plt.show()
# Evaluate the model on evaluation dataset
eval_results = trainer.evaluate()
# Print results
print(eval_results)
{'eval_loss': 0.5713134407997131, 'eval_accuracy': 0.699, 'eval_runtime': 21.0249, 'eval_samples_per_second': 47.563, 'eval_steps_per_second': 47.563, 'epoch': 7.9995}
Evaluation of transformer based model using test data, confusion matrix and classification report¶
y_pred_probs = trainer.predict(test_dataset)
y_pred_probs.predictions.shape
(1000, 2)
# Extract predicted labels
preds = np.argmax(y_pred_probs.predictions, axis=1)
# Compute confusion matrix
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
import matplotlib.pyplot as plt
cm = confusion_matrix(test_df['label'], preds)
# Plot confusion matrix
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=[0, 1])
disp.plot(cmap=plt.cm.Blues)
plt.title('Confusion Matrix')
plt.show()
# Print evaluation results
print(f"Evaluation results: {y_pred_probs.metrics}")
Evaluation results: {'test_loss': 0.5969365835189819, 'test_accuracy': 0.681, 'test_runtime': 22.5106, 'test_samples_per_second': 44.424, 'test_steps_per_second': 44.424}
from sklearn.metrics import classification_report
# Print classification report
print(classification_report(test_df['label'], preds, target_names=['Negative', 'Positive']))
| | precision | recall | f1-score | support |
|---|---|---|---|---|
| Negative | 0.67 | 0.72 | 0.69 | 500 |
| Positive | 0.69 | 0.65 | 0.67 | 500 |
| accuracy | | | 0.68 | 1000 |
| macro avg | 0.68 | 0.68 | 0.68 | 1000 |
| weighted avg | 0.68 | 0.68 | 0.68 | 1000 |
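The columns of the classification report all derive from the four cells of the confusion matrix. The sketch below shows that derivation for the positive class. The counts passed in are made-up illustrative values, not the counts from this run, and `binary_report` is a hypothetical helper rather than a scikit-learn function.

```python
def binary_report(tn, fp, fn, tp):
    """Precision, recall, F1 for the positive class, plus overall accuracy."""
    precision = tp / (tp + fp)          # share of predicted positives that are correct
    recall = tp / (tp + fn)             # share of true positives that are found
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tn + fp + fn + tp)
    return precision, recall, f1, accuracy

# Illustrative counts only: 50 negatives and 50 positives.
precision, recall, f1, accuracy = binary_report(tn=40, fp=10, fn=20, tp=30)
print(precision, recall, f1, accuracy)
```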
Calibration report over eval dataset¶
# Platt Scaling
from sklearn.linear_model import LogisticRegression
from sklearn.calibration import CalibratedClassifierCV
import numpy as np
pred_eval = trainer.predict(eval_dataset)
preds = np.argmax(pred_eval.predictions, axis=1)
# Calibrate on a continuous score (the margin between the two class logits);
# the argmax labels are already 0/1 and carry no confidence to calibrate
logits = pred_eval.predictions
scores = logits[:, 1] - logits[:, 0]
# Fit a logistic regression model for calibration
calibrator = CalibratedClassifierCV(LogisticRegression(), method='sigmoid')
calibrator.fit(scores.reshape(-1, 1), pred_eval.label_ids)
# Calibrate probabilities
calibrated_probs = calibrator.predict_proba(scores.reshape(-1, 1))[:, 1]
probs = np.hstack((1 - calibrated_probs.reshape(-1, 1), calibrated_probs.reshape(-1, 1)))
import matplotlib.pyplot as plt
from sklearn.calibration import calibration_curve
# Plot calibration curve
prob_true, prob_pred = calibration_curve(pred_eval.label_ids, probs[:, 1], n_bins=10)
plt.figure(figsize=(8, 8))
plt.plot(prob_pred, prob_true, marker='o', linestyle='--', color='b', label='Transformer based model')
plt.plot([0, 1], [0, 1], linestyle='--', color='k', label='Perfectly calibrated')
plt.xlabel('Predicted probability')
plt.ylabel('True probability')
plt.title('Calibration plot (Reliability Curve)')
plt.legend()
plt.show()
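A reliability curve is informative only when it is built from continuous scores. As a minimal sketch of the idea, the code below turns two-class logits into a positive-class probability with a softmax and then groups examples into equal-width bins, comparing the mean predicted probability with the observed positive rate in each bin. The logits and labels are toy values, and both helper functions are hypothetical names, not scikit-learn APIs.

```python
import math

def positive_probability(logit_neg, logit_pos):
    """Softmax probability of the positive class for a two-logit output."""
    m = max(logit_neg, logit_pos)       # subtract the max for numerical stability
    e_neg = math.exp(logit_neg - m)
    e_pos = math.exp(logit_pos - m)
    return e_pos / (e_neg + e_pos)

def reliability_bins(probs, labels, n_bins=5):
    """Return (mean predicted prob, observed positive rate) per non-empty bin."""
    sums = [[0.0, 0, 0] for _ in range(n_bins)]  # prob sum, positives, count
    for p, y in zip(probs, labels):
        b = min(int(p * n_bins), n_bins - 1)
        sums[b][0] += p
        sums[b][1] += y
        sums[b][2] += 1
    return [(s[0] / s[2], s[1] / s[2]) for s in sums if s[2] > 0]

# Toy logits (negative class, positive class) and toy labels.
toy_logits = [(2.0, -1.0), (0.0, 0.0), (-1.0, 2.0)]
toy_labels = [0, 1, 1]
probs = [positive_probability(a, b) for a, b in toy_logits]
bins = reliability_bins(probs, toy_labels)
print(bins)
```

For a well-calibrated model the two numbers in each bin would be close to each other, which corresponds to points near the diagonal in the plot.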
Using pre-trained allenai/longformer-base-4096 LongformerForSequenceClassification model¶
max_length = 4096
import json
import os
from transformers import AutoTokenizer, AutoModelForSequenceClassification, AutoConfig
# Modify the configuration to set max_position_embeddings if needed
config = AutoConfig.from_pretrained('allenai/longformer-base-4096',
    num_labels=2,
    max_position_embeddings=max_length
)
# Save the modified configuration to a JSON file
config_dir = prefix_path + '/config'
os.makedirs(config_dir, exist_ok=True)
config_path = os.path.join(config_dir, 'config.json')
with open(config_path, 'w') as f:
    json.dump(config.to_dict(), f)
tokenizer = AutoTokenizer.from_pretrained('allenai/longformer-base-4096', config=config)
# Load pre-trained model with custom configuration
# (the deprecated resume_download, use_auth_token and mirror arguments are omitted)
model = AutoModelForSequenceClassification.from_pretrained(
    'allenai/longformer-base-4096',
    config=config_path, # Use the path to the configuration file
    cache_dir=prefix_path + '/cache', # Cache directory
    force_download=False, # Reuse cached weights instead of downloading again
    proxies=None, # No proxy servers
    output_loading_info=False, # Do not return loading info
    local_files_only=False, # Allow downloads from the Hub, not only local files
    revision='main', # Specific model version
    #gradient_checkpointing=True,
    ignore_mismatched_sizes=True
)
from peft import LoraConfig, get_peft_model
# Define LoRA configuration
lora_config = LoraConfig(
r=8, # The rank of the LoRA matrix
lora_alpha=32, # Scaling factor
target_modules=["query", "value"], # Target modules for LoRA
lora_dropout=0.1, # Dropout rate
bias="none" # Bias handling
)
# Apply LoRA to the model
model = get_peft_model(model, lora_config)
Some weights of LongformerForSequenceClassification were not initialized from the model checkpoint at allenai/longformer-base-4096 and are newly initialized: ['classifier.dense.bias', 'classifier.dense.weight', 'classifier.out_proj.bias', 'classifier.out_proj.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
# Create dataset objects
train_dataset = SentencePairDataset(train_df, tokenizer, max_length)
eval_dataset = SentencePairDataset(eval_df, tokenizer, max_length)
test_dataset = SentencePairDataset(test_df, tokenizer, max_length)
train_dataset.__getitem__(0)
{'input_ids': tensor([ 0, 10932, 42, ..., 1, 1, 1]),
'token_type_ids': tensor([0, 0, 0, ..., 0, 0, 0]),
'attention_mask': tensor([1, 1, 1, ..., 0, 0, 0]),
'length': tensor(4096),
'labels': tensor(0)}
# DataLoader
train_loader = DataLoader(train_dataset, batch_size=1, shuffle=True, collate_fn=default_data_collator)
val_loader = DataLoader(eval_dataset, batch_size=1, collate_fn=default_data_collator)
test_loader = DataLoader(test_dataset, batch_size=1, collate_fn=default_data_collator)
import torch
# Check if CUDA is available
if torch.cuda.is_available():
    device_id = torch.cuda.current_device() # Get the current device ID
    torch.cuda.empty_cache()
    torch.cuda.reset_peak_memory_stats(device_id) # Pass the device ID
    print(torch.cuda.memory_summary(device=None, abbreviated=False))
else:
    print("CUDA is not available. Please check your installation.")
|===========================================================================| | PyTorch CUDA memory summary, device ID 0 | |---------------------------------------------------------------------------| | CUDA OOMs: 0 | cudaMalloc retries: 0 | |===========================================================================| | Metric | Cur Usage | Peak Usage | Tot Alloc | Tot Freed | |---------------------------------------------------------------------------| | Allocated memory | 450034 KiB | 450034 KiB | 88270 GiB | 88269 GiB | | from large pool | 444928 KiB | 444928 KiB | 54891 GiB | 54891 GiB | | from small pool | 5106 KiB | 5106 KiB | 33378 GiB | 33378 GiB | |---------------------------------------------------------------------------| | Active memory | 450034 KiB | 450034 KiB | 88270 GiB | 88269 GiB | | from large pool | 444928 KiB | 444928 KiB | 54891 GiB | 54891 GiB | | from small pool | 5106 KiB | 5106 KiB | 33378 GiB | 33378 GiB | |---------------------------------------------------------------------------| | Requested memory | 448927 KiB | 448927 KiB | 81565 GiB | 81565 GiB | | from large pool | 443822 KiB | 443822 KiB | 48188 GiB | 48188 GiB | | from small pool | 5105 KiB | 5105 KiB | 33377 GiB | 33377 GiB | |---------------------------------------------------------------------------| | GPU reserved memory | 528384 KiB | 528384 KiB | 254702 MiB | 254186 MiB | | from large pool | 501760 KiB | 501760 KiB | 930 MiB | 440 MiB | | from small pool | 26624 KiB | 26624 KiB | 253772 MiB | 253746 MiB | |---------------------------------------------------------------------------| | Non-releasable memory | 78349 KiB | 78349 KiB | 105710 GiB | 105710 GiB | | from large pool | 56832 KiB | 56832 KiB | 68984 GiB | 68984 GiB | | from small pool | 21517 KiB | 21517 KiB | 36725 GiB | 36725 GiB | |---------------------------------------------------------------------------| | Allocations | 399 | 399 | 102178 K | 102178 K | | from large pool | 77 | 77 | 25952 K | 25952 K | | from 
small pool | 322 | 322 | 76226 K | 76226 K | |---------------------------------------------------------------------------| | Active allocs | 399 | 399 | 102178 K | 102178 K | | from large pool | 77 | 77 | 25952 K | 25952 K | | from small pool | 322 | 322 | 76226 K | 76226 K | |---------------------------------------------------------------------------| | GPU reserved segments | 34 | 34 | 126929 | 126895 | | from large pool | 21 | 21 | 43 | 22 | | from small pool | 13 | 13 | 126886 | 126873 | |---------------------------------------------------------------------------| | Non-releasable allocs | 60 | 60 | 56747 K | 56747 K | | from large pool | 20 | 20 | 19373 K | 19373 K | | from small pool | 40 | 40 | 37374 K | 37374 K | |---------------------------------------------------------------------------| | Oversize allocations | 0 | 0 | 0 | 0 | |---------------------------------------------------------------------------| | Oversize GPU segments | 0 | 0 | 0 | 0 | |===========================================================================|
# Trainer
trainer = CustomTrainer(
model=model,
args=training_args,
train_dataset=train_dataset,
eval_dataset=eval_dataset,
data_collator=default_data_collator,
compute_metrics=compute_metrics,
tokenizer=tokenizer, # Pass the tokenizer here
callbacks=[early_stopping]
)
# Train the model
trainer.train()
# Save the model, tokenizer, and trainer state
model.save_pretrained(prefix_path + '/results')
#tokenizer.save_pretrained(prefix_path + '/results')
#trainer.save_model(prefix_path + '/results') # Save the trained model
trainer.save_state()
# Save the model and tokenizer
model_save_path = prefix_path + '/results' + "/model"
tokenizer_save_path = prefix_path + '/results' + "/tokenizer"
trainer.save_model(model_save_path)
tokenizer.save_pretrained(tokenizer_save_path)
# Save the trainer state
trainer.state.save_to_json( prefix_path + '/results' + "/trainer_state.json")
Initializing global attention on CLS token...
| Epoch | Training Loss | Validation Loss | Accuracy |
|---|---|---|---|
| 0 | 0.660300 | 0.646325 | 0.632000 |
| 1 | 0.558600 | 0.586706 | 0.697000 |
| 3 | 0.582600 | 0.574838 | 0.705000 |
| 4 | 0.514200 | 0.594929 | 0.689000 |
| 6 | 0.615000 | 0.571699 | 0.708000 |
| 7 | 0.509200 | 0.565953 | 0.718000 |
| 9 | 0.526000 | 0.560785 | 0.721000 |
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`. warnings.warn(
| Epoch | Training Loss | Validation Loss | Accuracy |
|---|---|---|---|
| 0 | 0.660300 | 0.646325 | 0.632000 |
| 1 | 0.558600 | 0.586706 | 0.697000 |
| 3 | 0.582600 | 0.574838 | 0.705000 |
| 4 | 0.514200 | 0.594929 | 0.689000 |
| 6 | 0.615000 | 0.571699 | 0.708000 |
| 7 | 0.509200 | 0.565953 | 0.718000 |
| 9 | 0.591700 | 0.564030 | 0.721000 |
Evaluation of eval dataset and plot of training and validation loss¶
import matplotlib.pyplot as plt
# Initialize lists to hold the extracted values
val_losses = []
training_losses = []
epochs = []
# Track the last training loss before each eval_loss entry
last_training_loss = None
for entry in trainer.state.log_history:
    # Remember the most recent training loss
    if 'loss' in entry:
        last_training_loss = entry['loss']
    # Record eval_loss with its epoch and the latest training loss seen before it
    if 'eval_loss' in entry:
        val_losses.append(entry['eval_loss'])
        training_losses.append(last_training_loss)
        epochs.append(entry['epoch'])
# Plot training and validation loss
plt.plot(epochs, training_losses, label='Training Loss')
plt.plot(epochs, val_losses, label='Validation Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.title('Training and Validation Loss')
plt.legend()
plt.show()
# Evaluate the model on evaluation dataset
eval_results = trainer.evaluate()
# Print results
print(eval_results)
{'eval_loss': 0.5607852935791016, 'eval_accuracy': 0.721, 'eval_runtime': 217.9064, 'eval_samples_per_second': 4.589, 'eval_steps_per_second': 4.589, 'epoch': 9.99}
Evaluation of longformer based model on test data, confusion matrix and classification report¶
y_pred_probs = trainer.predict(test_dataset)
# Extract predicted labels
preds = np.argmax(y_pred_probs.predictions, axis=1)
# Compute confusion matrix
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
import matplotlib.pyplot as plt
cm = confusion_matrix(test_df['label'], preds)
# Plot confusion matrix
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=[0, 1])
disp.plot(cmap=plt.cm.Blues)
plt.title('Confusion Matrix')
plt.show()
# Print evaluation results
print(f"Evaluation results: {y_pred_probs.metrics}")
Evaluation results: {'test_loss': 0.5858533978462219, 'test_accuracy': 0.715, 'test_runtime': 217.8881, 'test_samples_per_second': 4.59, 'test_steps_per_second': 4.59}
from sklearn.metrics import classification_report # Import the missing function
# Print classification report
print(classification_report(test_df['label'], preds, target_names=['Negative', 'Positive']))
              precision    recall  f1-score   support

    Negative       0.70      0.76      0.73       500
    Positive       0.74      0.67      0.70       500

    accuracy                           0.71      1000
   macro avg       0.72      0.72      0.71      1000
weighted avg       0.72      0.71      0.71      1000
Calibration report over eval_dataset¶
# Platt Scaling
from sklearn.linear_model import LogisticRegression
from sklearn.calibration import CalibratedClassifierCV
import numpy as np
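pred_eval = trainer.predict(eval_dataset)
logits = pred_eval.predictions
# Use the positive-class softmax probability as the continuous score to calibrate
# (hard argmax labels would discard the model's confidence)
shifted = logits - logits.max(axis=1, keepdims=True)
probs_flat = np.exp(shifted)[:, 1] / np.exp(shifted).sum(axis=1)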
# Fit a logistic regression model for calibration
calibrator = CalibratedClassifierCV(LogisticRegression(), method='sigmoid')
calibrator.fit(probs_flat.reshape(-1, 1), eval_df['label'])
# Calibrate probabilities
calibrated_probs = calibrator.predict_proba(probs_flat.reshape(-1, 1))[:, 1]
probs = np.hstack((1 - calibrated_probs.reshape(-1, 1), calibrated_probs.reshape(-1, 1)))
import matplotlib.pyplot as plt
from sklearn.calibration import calibration_curve
# Plot calibration curve
prob_true, prob_pred = calibration_curve(eval_df['label'], probs[:, 1], n_bins=10)
plt.figure(figsize=(8, 8))
plt.plot(prob_pred, prob_true, marker='o', linestyle='--', color='b', label='Transformer based model')
plt.plot([0, 1], [0, 1], linestyle='--', color='k', label='Perfectly calibrated')
plt.xlabel('Predicted probability')
plt.ylabel('True probability')
plt.title('Calibration plot (Reliability Curve)')
plt.legend()
plt.show()
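Platt scaling fits a one-dimensional logistic regression, p = sigmoid(a * s + b), on a held-out continuous score s. The sketch below is a dependency-free illustration of that fit, not the notebook's implementation: the function names, the toy scores and labels, and the use of plain gradient descent are all assumptions made for the example.

```python
import math

def sigmoid(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))

def fit_platt(scores, labels, lr=0.1, steps=5000):
    """Fit p = sigmoid(a * score + b) by gradient descent on the mean log loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = 0.0
        grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y  # derivative of the log loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

# Hypothetical held-out scores (logit of class 1 minus logit of class 0) and labels.
# The raw model is overconfident: sigmoid(4) is about 0.98, yet only 4 of 5 are right.
toy_scores = [4.0] * 5 + [-4.0] * 5
toy_labels = [1, 1, 1, 1, 0] + [0, 0, 0, 0, 1]

a, b = fit_platt(toy_scores, toy_labels)
calibrated = sigmoid(a * 4.0 + b)
print(f"a={a:.3f}, b={b:.3f}, calibrated P(label=1 | score=4) = {calibrated:.2f}")
```

On this toy data the fitted slope shrinks the scores so that the calibrated probability matches the observed hit rate of 4 in 5. The fit needs a continuous score as input; a column of 0/1 predictions leaves the calibrator only two distinct inputs to work with.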
Calibration by temperature scaling¶
import numpy as np
# Get predictions on the validation set
predictions = trainer.predict(encoded_dataset['validation'])
logits = predictions.predictions
labels = predictions.label_ids
# Convert logits to probabilities
probs = torch.nn.functional.softmax(torch.tensor(logits), dim=-1).numpy()
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-3-e1a1d7068d18> in <cell line: 4>()
      2 
      3 # Get predictions on the validation set
----> 4 predictions = trainer.predict(encoded_dataset['validation'])
      5 logits = predictions.predictions
      6 labels = predictions.label_ids

NameError: name 'trainer' is not defined
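Temperature scaling divides the validation logits by a single scalar T before the softmax and picks the T that minimises the negative log-likelihood on held-out data. The following is a minimal, dependency-free sketch of that idea under stated assumptions: the helper names, the grid search (instead of a gradient-based optimiser) and the toy logits are hypothetical and are not taken from the notebook.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax of one row of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def nll(logits_rows, labels, temperature):
    """Mean negative log-likelihood of the true labels at a given temperature."""
    total = 0.0
    for row, y in zip(logits_rows, labels):
        total -= math.log(softmax(row, temperature)[y])
    return total / len(labels)

def fit_temperature(logits_rows, labels, grid=None):
    """Return the temperature from a grid that minimises the held-out NLL."""
    if grid is None:
        grid = [0.05 * k for k in range(1, 201)]  # 0.05, 0.10, ..., 10.0
    return min(grid, key=lambda t: nll(logits_rows, labels, t))

# Hypothetical validation logits: every row has a margin of 4 between the classes,
# but only 4 of the 6 predictions are correct, so the model is overconfident.
toy_logits = [[2.0, -2.0], [2.0, -2.0], [-2.0, 2.0],
              [-2.0, 2.0], [2.0, -2.0], [-2.0, 2.0]]
toy_labels = [0, 0, 1, 1, 1, 0]

best_t = fit_temperature(toy_logits, toy_labels)
print(f"best temperature: {best_t:.2f}")
print(f"NLL at T=1: {nll(toy_logits, toy_labels, 1.0):.3f}")
print(f"NLL at best T: {nll(toy_logits, toy_labels, best_t):.3f}")
```

Because every logit is divided by the same positive T, the predicted class of each example is unchanged; only the confidence attached to it moves. With real model outputs, `toy_logits` and `toy_labels` would be replaced by the validation logits and label ids.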
Two tower (ensemble for prediction) BERT and BERT¶
import numpy as np
from sklearn.model_selection import train_test_split
import pandas as pd
# Work on a copy so that adding the label column does not write into a slice of movie_reviews
X = movie_reviews[['review_text', 'plot_synopsis']].copy()
X['label'] = movie_reviews['is_spoiler'].astype(int)
# Split by class
positive_x = X[X['label'] == 1]
negative_x = X[X['label'] == 0]
# Downsample each class to 5000 rows (sampling without replacement)
positive_x = positive_x.sample(n=5000, replace=False, random_state=42)
negative_x = negative_x.sample(n=5000, replace=False, random_state=42)
# Combine the two equally sized samples
balanced_x = pd.concat([negative_x, positive_x])
# Shuffle the combined dataframe
balanced_x = balanced_x.sample(frac=1, random_state=42).reset_index(drop=True)
from sklearn.model_selection import train_test_split
# Split data into train, validation, and test sets
train_df, temp_df = train_test_split(balanced_x, test_size=0.2, random_state=42, stratify=balanced_x['label'])
eval_df, test_df = train_test_split(temp_df, test_size=0.5, random_state=42, stratify=temp_df['label'])
import json
import os
from transformers import AutoTokenizer, AutoModelForSequenceClassification, AutoConfig
# Modify the configuration to set max_position_embeddings if needed
config = AutoConfig.from_pretrained(
    'bert-base-uncased',
    num_labels=2,
    max_position_embeddings=max_length,
)
# Save the modified configuration to a JSON file
config_dir = prefix_path + '/config'
os.makedirs(config_dir, exist_ok=True)
config_path = os.path.join(config_dir, 'config.json')
with open(config_path, 'w') as f:
    json.dump(config.to_dict(), f)
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased', config=config)
# Load the pre-trained model for the premise tower (review_text) with the custom configuration
model_premise = AutoModelForSequenceClassification.from_pretrained(
    'bert-base-uncased',
    config=config,  # Pass the configuration object built above
    cache_dir=prefix_path + '/premise_cache',  # Cache directory
    force_download=False,  # Reuse cached weights when available
    proxies=None,  # No proxy servers
    output_loading_info=False,  # Do not return loading info
    local_files_only=False,  # Allow downloading from the Hub
    revision='main',  # Specific model version
    #gradient_checkpointing=True,
    ignore_mismatched_sizes=True,
)
from peft import LoraConfig, get_peft_model
# Define LoRA configuration for the premise tower
lora_config = LoraConfig(
    r=8,  # The rank of the LoRA matrices
    lora_alpha=32,  # Scaling factor
    target_modules=["query", "value"],  # Attention projections that receive adapters
    lora_dropout=0.1,  # Dropout rate
    bias="none",  # Do not train bias terms
)
# Apply LoRA to the premise model
model_premise = get_peft_model(model_premise, lora_config)
# Load a second pre-trained model for the hypothesis tower (plot_synopsis)
model_hypothesis = AutoModelForSequenceClassification.from_pretrained(
    'bert-base-uncased',
    config=config,  # Pass the configuration object built above
    cache_dir=prefix_path + '/hypothesis-cache',  # Cache directory
    force_download=False,  # Reuse cached weights when available
    proxies=None,  # No proxy servers
    output_loading_info=False,  # Do not return loading info
    local_files_only=False,  # Allow downloading from the Hub
    revision='main',  # Specific model version
    #gradient_checkpointing=True,
    ignore_mismatched_sizes=True,
)
# Define the same LoRA configuration for the hypothesis tower
lora_config = LoraConfig(
    r=8,  # The rank of the LoRA matrices
    lora_alpha=32,  # Scaling factor
    target_modules=["query", "value"],  # Attention projections that receive adapters
    lora_dropout=0.1,  # Dropout rate
    bias="none",  # Do not train bias terms
)
# Apply LoRA to the hypothesis model
model_hypothesis = get_peft_model(model_hypothesis, lora_config)
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight'] You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
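To give a sense of how small the LoRA adapters configured above are, the sketch below counts the adapter weights for r=8 on the "query" and "value" projections. It assumes the standard bert-base-uncased dimensions (hidden size 768, 12 encoder layers) and counts only the low-rank A and B matrices; the classifier head and any other trainable parameters are deliberately left out, and the helper name `lora_params` is made up for the example.

```python
def lora_params(in_features, out_features, rank):
    """Weights added by one LoRA adapter: A is (rank x in), B is (out x rank)."""
    return rank * (in_features + out_features)

hidden_size = 768        # bert-base-uncased hidden size
num_layers = 12          # bert-base-uncased encoder layers
rank = 8                 # r in the LoRA configuration
lora_alpha = 32          # lora_alpha in the LoRA configuration
targets_per_layer = 2    # "query" and "value"

per_matrix = lora_params(hidden_size, hidden_size, rank)
total = per_matrix * targets_per_layer * num_layers
full_matrix = hidden_size * hidden_size
scaling = lora_alpha / rank  # factor applied to the low-rank update

print(f"adapter weights per projection: {per_matrix}")
print(f"adapter weights per tower:      {total}")
print(f"share of one full projection:   {per_matrix / full_matrix:.4f}")
print(f"update scaling (alpha / r):     {scaling}")
```

Each tower therefore trains a few hundred thousand adapter weights while the pre-trained BERT weights stay frozen, which is what keeps fine-tuning two towers affordable.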
import torch
from torch.utils.data import Dataset

class SentenceDataset(Dataset):
    """Tokenizes one text column of the dataframe (selected by position) for one tower."""
    def __init__(self, dataframe, tokenizer, max_length, is_hypothesis_index):
        self.dataframe = dataframe
        self.tokenizer = tokenizer
        self.max_length = max_length
        # Column position of the text: 0 = review_text (premise), 1 = plot_synopsis (hypothesis)
        self.is_hypothesis_index = is_hypothesis_index

    def __len__(self):
        return len(self.dataframe)

    def __getitem__(self, idx):
        sentence = self.dataframe.iloc[idx, self.is_hypothesis_index]
        label = self.dataframe.iloc[idx, 2]  # 'label' is the third column
        inputs = self.tokenizer.encode_plus(
            sentence,
            add_special_tokens=True,
            max_length=self.max_length,
            truncation=True,
            padding="max_length",
            return_tensors="pt",
            return_attention_mask=True,
            return_token_type_ids=True,
            return_overflowing_tokens=False,
            return_length=True,
        )
        inputs["labels"] = torch.tensor(label, dtype=torch.long)
        # Drop the batch dimension added by return_tensors="pt"
        return {key: val.squeeze(0) for key, val in inputs.items()}
train_dataset_premise = SentenceDataset(train_df, tokenizer, max_length, 0)
train_dataset_hypothesis = SentenceDataset(train_df, tokenizer, max_length, 1)
eval_dataset_premise = SentenceDataset(eval_df, tokenizer, max_length, 0)
eval_dataset_hypothesis = SentenceDataset(eval_df, tokenizer, max_length, 1)
test_dataset_premise = SentenceDataset(test_df, tokenizer, max_length, 0)
test_dataset_hypothesis = SentenceDataset(test_df, tokenizer, max_length, 1)
train_dataset_premise.__getitem__(0)
{'input_ids': tensor([ 101, 2748, 2023, 2003, 1037, 10225, 25318, 3185, 10303, 2045,
2003, 2070, 11749, 5171, 10225, 25318, 13764, 8649, 1998, 1037,
8552, 1998, 13749, 18595, 3560, 2466, 1012, 2021, 2242, 2003,
4394, 2182, 1010, 2926, 4102, 2000, 10225, 25318, 1005, 1055,
2010, 2060, 2147, 1012, 2364, 3291, 2003, 1045, 2228, 1996,
3494, 1012, 2027, 2024, 2204, 2035, 1998, 18080, 1999, 1037,
2126, 2021, 2027, 2024, 2025, 2428, 13432, 1012, 2926, 1996,
2364, 2839, 9901, 2829, 14089, 24665, 3771, 2003, 15640, 1012,
2016, 2003, 2025, 2428, 2844, 2438, 2004, 1037, 2839, 2000,
4287, 2023, 2972, 3185, 1012, 2613, 2364, 2839, 1997, 1996,
3185, 2000, 2033, 2001, 5212, 1048, 1012, 4027, 2021, 2339,
2821, 2339, 2106, 2017, 2018, 2000, 4929, 2008, 5976, 2559,
18865, 3520, 1029, 2839, 2008, 1045, 5621, 4669, 2001, 2728,
2139, 9152, 3217, 1005, 1055, 1012, 2002, 3248, 1037, 4066,
1997, 10916, 2535, 2066, 2017, 2052, 2025, 5987, 2013, 2032,
1999, 1037, 3185, 2066, 2023, 1012, 1996, 2717, 1997, 1996,
2839, 2024, 2025, 2428, 2844, 2030, 5875, 2438, 2000, 4287,
2023, 2972, 3185, 1012, 2428, 10473, 5783, 1997, 1996, 3185,
2003, 1996, 8235, 2135, 2517, 5896, 2008, 2320, 2153, 2003,
3561, 2007, 2070, 5171, 10225, 25318, 3787, 1998, 21438, 1012,
2748, 1010, 1996, 4566, 2453, 2018, 2042, 1037, 2978, 5410,
1998, 3383, 15640, 2021, 2053, 2009, 2106, 2025, 9868, 1996,
3185, 2005, 2033, 1012, 1045, 2145, 5632, 2023, 3185, 1010,
2009, 2074, 2347, 1005, 1056, 2004, 2204, 2004, 1045, 2018,
5113, 2005, 1012, 2009, 2001, 2785, 1997, 2066, 1996, 3185,
2041, 1997, 4356, 2011, 7112, 2061, 4063, 4059, 2232, 2029,
2038, 1037, 2785, 1997, 2714, 3857, 22264, 27983, 1998, 2466,
1010, 2009, 3310, 2200, 2485, 2012, 2335, 2021, 2009, 2196,
3243, 3138, 1996, 9850, 1012, 2145, 1037, 6669, 3422, 3085,
1010, 3262, 4569, 3185, 2021, 2145, 2017, 2024, 2488, 1997,
3666, 10225, 25318, 1005, 1055, 2060, 2147, 1012, 27671, 102,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0]),
'token_type_ids': tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0]),
'attention_mask': tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0]),
'length': tensor(512),
'labels': tensor(0)}
# Trainer
trainer_premise = CustomTrainer(
    model=model_premise,
    args=training_args,
    train_dataset=train_dataset_premise,
    eval_dataset=eval_dataset_premise,
    data_collator=default_data_collator,
    compute_metrics=compute_metrics,
    tokenizer=tokenizer,  # Pass the tokenizer here
    callbacks=[early_stopping]
)
# Train the model
trainer_premise.train()
# Save the LoRA-adapted model and the trainer state
model_premise.save_pretrained(prefix_path + '/results_premise')
trainer_premise.save_state()
# Save the model and tokenizer
model_save_path = prefix_path + '/results_premise' + "/model"
tokenizer_save_path = prefix_path + '/results_premise' + "/tokenizer"
trainer_premise.save_model(model_save_path)
tokenizer.save_pretrained(tokenizer_save_path)
# Save the trainer state
trainer_premise.state.save_to_json(prefix_path + '/results_premise' + "/trainer_state.json")
| Epoch | Training Loss | Validation Loss | Accuracy |
|---|---|---|---|
| 0 | 0.619400 | 0.565407 | 0.716000 |
| 1 | 0.508100 | 0.572391 | 0.699000 |
| 3 | 0.476300 | 0.559408 | 0.713000 |
# Trainer
trainer_hypothesis = CustomTrainer(
    model=model_hypothesis,  # Train the hypothesis tower, not the premise model
    args=training_args,
    train_dataset=train_dataset_hypothesis,
    eval_dataset=eval_dataset_hypothesis,
    data_collator=default_data_collator,
    compute_metrics=compute_metrics,
    tokenizer=tokenizer,  # Pass the tokenizer here
    callbacks=[early_stopping]
)
# Train the model
trainer_hypothesis.train()
# Save the LoRA-adapted model and the trainer state
model_hypothesis.save_pretrained(prefix_path + '/results_hypothesis')
trainer_hypothesis.save_state()
# Save the model and tokenizer
model_save_path = prefix_path + '/results_hypothesis' + "/model"
tokenizer_save_path = prefix_path + '/results_hypothesis' + "/tokenizer"
trainer_hypothesis.save_model(model_save_path)
tokenizer.save_pretrained(tokenizer_save_path)
# Save the trainer state
trainer_hypothesis.state.save_to_json(prefix_path + '/results_hypothesis' + "/trainer_state.json")
| Epoch | Training Loss | Validation Loss | Accuracy |
|---|---|---|---|
| 0 | 0.703500 | 0.686365 | 0.556000 |
| 1 | 0.685100 | 0.684139 | 0.557000 |
| 3 | 0.705600 | 0.681133 | 0.560000 |
| 4 | 0.698000 | 0.682720 | 0.544000 |
| 6 | 0.662500 | 0.681782 | 0.556000 |
Evaluation of the 2-tower ensemble on the test dataset, confusion matrix and classification report¶
avg_pred
tensor([0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1,
0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1,
0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0,
0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0,
0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0,
0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0,
1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0,
0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0,
0, 1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0,
1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0,
0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0,
0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1,
1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1,
0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0,
1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0,
1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0,
0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0,
0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0,
0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1,
0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0,
1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0,
0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1,
0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0,
0, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0,
0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0,
1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0,
1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1,
0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0,
0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0,
1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1,
0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0,
0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0,
1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0,
0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0,
0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0,
1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1,
0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0])
import numpy as np
import torch
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
import matplotlib.pyplot as plt
predictions_premise_test = trainer_premise.predict(test_dataset_premise)
predictions_hypothesis_test = trainer_hypothesis.predict(test_dataset_hypothesis)
# Convert each tower's logits to class probabilities
probs_premise = torch.softmax(torch.tensor(predictions_premise_test.predictions), dim=1)
probs_hypothesis = torch.softmax(torch.tensor(predictions_hypothesis_test.predictions), dim=1)
# Average the two towers' probabilities (soft voting) and pick the more likely class
avg_probs = (probs_premise + probs_hypothesis) / 2
preds = avg_probs.argmax(dim=1).numpy()
# Compute confusion matrix
conf_matrix = confusion_matrix(test_df['label'], preds)
# Confidence score: averaged probability of the predicted class
confidence_scores = avg_probs.max(dim=1).values.numpy()
print(confidence_scores)
# Plot confusion matrix
disp = ConfusionMatrixDisplay(confusion_matrix=conf_matrix, display_labels=[0, 1])
disp.plot(cmap=plt.cm.Blues)
plt.title('Confusion Matrix')
plt.show()
from sklearn.metrics import classification_report
# Store the report under its own name so the imported function is not overwritten
report = classification_report(test_df['label'], preds, target_names=['Negative', 'Positive'])
print(report)
              precision    recall  f1-score   support

    Negative       0.45      0.61      0.52       500
    Positive       0.40      0.26      0.31       500

    accuracy                           0.43      1000
   macro avg       0.42      0.43      0.42      1000
weighted avg       0.42      0.43      0.42      1000
Calibration report over eval_dataset¶
# Platt Scaling
from sklearn.linear_model import LogisticRegression
from sklearn.calibration import CalibratedClassifierCV
import numpy as np
import torch
pred_eval_premise = trainer_premise.predict(eval_dataset_premise)
pred_eval_hypothesis = trainer_hypothesis.predict(eval_dataset_hypothesis)
# Convert each tower's logits to class probabilities
probs_premise = torch.softmax(torch.tensor(pred_eval_premise.predictions), dim=1)
probs_hypothesis = torch.softmax(torch.tensor(pred_eval_hypothesis.predictions), dim=1)
# Average the two towers (soft voting) and keep the positive-class probability
# as the continuous score to calibrate
avg_probs = (probs_premise + probs_hypothesis) / 2
probs_flat = avg_probs[:, 1].numpy()
# Fit a logistic regression model for calibration
calibrator = CalibratedClassifierCV(LogisticRegression(), method='sigmoid')
calibrator.fit(probs_flat.reshape(-1, 1), eval_df['label'])
# Calibrate probabilities
calibrated_probs = calibrator.predict_proba(probs_flat.reshape(-1, 1))[:, 1]
probs = np.hstack((1 - calibrated_probs.reshape(-1, 1), calibrated_probs.reshape(-1, 1)))
import matplotlib.pyplot as plt
from sklearn.calibration import calibration_curve
# Plot calibration curve
prob_true, prob_pred = calibration_curve(eval_df['label'], probs[:, 1], n_bins=10)
plt.figure(figsize=(8, 8))
plt.plot(prob_pred, prob_true, marker='o', linestyle='--', color='b', label='Transformer based model')
plt.plot([0, 1], [0, 1], linestyle='--', color='k', label='Perfectly calibrated')
plt.xlabel('Predicted probability')
plt.ylabel('True probability')
plt.title('Calibration plot (Reliability Curve)')
plt.legend()
plt.show()
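The reliability curve above can be condensed into a single number by averaging, over the probability bins, the gap between the mean predicted probability and the observed fraction of positives, weighted by bin size. The sketch below is one simple binary variant of the expected calibration error; the function name, the equal-width binning and the toy inputs are assumptions made for illustration and are not part of the notebook.

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """Weighted mean gap between predicted probability and observed positive rate."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls into the last bin
        bins[idx].append((p, y))
    n = len(probs)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue  # empty bins contribute nothing
        mean_pred = sum(p for p, _ in bucket) / len(bucket)
        frac_pos = sum(y for _, y in bucket) / len(bucket)
        ece += len(bucket) / n * abs(frac_pos - mean_pred)
    return ece

# Hypothetical positive-class probabilities and true labels.
well_calibrated = expected_calibration_error([0.8] * 5, [1, 1, 1, 1, 0])
overconfident = expected_calibration_error([0.9] * 4, [1, 0, 1, 0])
mixed = expected_calibration_error([0.2, 0.2, 0.9, 0.9], [0, 0, 1, 0])
print(well_calibrated, overconfident, mixed)
```

A value near zero means the predicted probabilities track the observed frequencies; computing it before and after calibration on the same held-out split gives a quick check on whether the calibrator helped.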