diff --git a/example_app.png b/example_app.png
index be5f3cc1cefdc74832a328aa15a9d83e371c9d83..237ce0c0a2eaec9a3cc275fde6e3cba93c045e9f 100644
Binary files a/example_app.png and b/example_app.png differ
diff --git a/paper.md b/paper.md
new file mode 100644
index 0000000000000000000000000000000000000000..bf11652421d938cba0a61ca07eebcf158465e6f3
--- /dev/null
+++ b/paper.md
@@ -0,0 +1,80 @@
+title: 'A Tool for Assessing the Readability of Scientific Publications en Masse'
+
+tags:
+  - readability
+  - science communication
+  - science writing
+
+authors:
+  - name: Russell Jarvis
+    affiliation: PhD Candidate Neuroscience, Arizona State University
+  - name: Patrick McGurrin
+    affiliation: National Institute of Neurological Disorders and Stroke, National Institutes of Health
+  - name: Shivam Bansal
+    affiliation: Senior Data Scientist, H2O.ai
+  - name: Bradley G Lusk
+    affiliation: Science The Earth; Mesa, AZ 85201, USA
+  - name: Elise King
+    affiliation: Field Ecologist, University of Melbourne
+
+
+
+date: 20 October 2019
+
+bibliography: paper.bib
+
+# Summary
+To ensure that writing is accessible to the general population, authors must consider the length of written text, as well as sentence structure, vocabulary, and other language features [@Kutner:2006]. While popular magazines, newspapers, and other outlets purposefully tailor their language to a wide audience, academic writing tends to use more complex, jargon-heavy language [@Plavén-Sigray:2017].
+
+In the age of growing science communication, this tendency for scientists to use more complex language can carry over when writing in more mainstream media, such as blogs and social media. This can make public-facing material difficult to comprehend, undermining efforts to communicate scientific topics to the general public.
+
+While readability tools such as Readable (https://www.webfx.com/tools/read-able/) and Upgoer5 (https://splasho.com/upgoer5/) already report on the readability of text, they report the complexity of only a single document. In addition, these tools do not address complexity in an academic setting.
+
+To address this, we created a tool that uses a data-driven approach to provide authors with insights into the readability of the entirety of their published scholarly work relative to other text repositories. The tool first quantifies the readability of existing text repositories of varying complexity, and then uses this output as a reference to show how the readability of user-selected written work compares to these other known resources.
+
+This tool also introduces one additional feature for readability comparison and improvement. It allows the entry of two author names to enable a competition as to whose text has the lowest average readability score. Public competitions can often incentivize good practices, and this may be a fun and interactive tool to help improve readability scores over time.
+
+Ultimately, this tool will expand upon current readability metrics by providing a more detailed and comparative look at the complexity of written text. We hope that this will allow scientists and other experts to better monitor the complexity of their writing relative to other text types, leading to the creation of more accessible online material. We also hope that it will improve global communication and understanding of complex topics.
+
+# Methods
+
+### Text Analysis Metrics
+We built a web-scraping and text analysis infrastructure by extending many existing Free and Open Source (FOS) tools, including Google Scrape, Beautiful Soup, and Selenium.
+
+We first query a number of available text repositories with varying complexity:
+
+| Text Source | Mean Complexity | Description |
+|----------|----------|:-------------:|
+| Upgoer 5 | 6 | library using only the 10,000 most commonly occurring English words |
+| Wikipedia | 14.9 | free, popular, crowdsourced encyclopedia |
+| Post-Modern Essay Generator (PMEG) | 16.5 | generates output consisting of sentences that obey the rules of written English, but without restraints on the semantic conceptual references |
+| ART Corpus | 18.68 | library of scientific papers published by The Royal Society of Chemistry |
+
+When the user enters an author's name (or two names for the competition plot), the tool queries Google Scholar and returns the scraped results from articles containing the author name(s).
+
+The Flesch-Kincaid readability score [@Kincaid:1975] - the most commonly used metric to assess readability - is then used to quantify the complexity of all items.
+
+### Reproducibility
+A Dockerfile and associated container together serve as a self-documenting and portable clone of the software environment, ensuring reproducibility given the hierarchy of software dependencies.
+
+# Output
+Data are available here: [Open Science Framework data repository](https://osf.io/dashboard).
+
+## Contextualized Readability Output
+The generated plot for contextualized readability information is a histogram binned by readability score, initially populated exclusively by the ART corpus [@Soldatova:2007] data. We use these data because they form a pre-established library of scientific papers. The readability of the ART corpus has also been shown to be comparable to that of other scientific journals [2].
+
+The mean readability scores of the Upgoer5 [@Kuhn:2016], Wikipedia, and PMEG [@Bulhak:1996] libraries are labeled on the plot as single data points to contextualize the complexity of the ART corpus data against other text repositories of known complexity.
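The Flesch-Kincaid score named in the Methods section can be illustrated with a minimal sketch. It assumes the grade-level form of the formula, 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59, and uses a rough vowel-group syllable counter. Both `count_syllables` and `flesch_kincaid_grade` are hypothetical helper names for illustration, not functions from this repository, and the tool itself may compute the score differently.

```python
import re


def count_syllables(word):
    """Rough heuristic (an assumption): one syllable per group of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade level of a text; higher values mean harder to read."""
    # Naive sentence split on terminal punctuation; empty fragments are dropped.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


simple = "The cat sat on the mat. The dog ran to the park."
dense = ("Multidimensional characterization of heterogeneous neuronal "
         "populations necessitates sophisticated computational methodologies.")

print(round(flesch_kincaid_grade(simple), 2))
print(round(flesch_kincaid_grade(dense), 2))
```

Short sentences of one-syllable words score low, while a single long sentence of polysyllabic jargon scores high, which matches the ordering of the reference corpora in the table above.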
+
+We also include mean readability scores from two scholarly reference papers, Science Declining Over Time [@Kutner:2006] and Science of Writing [@Gopen:1990], which discuss writing for a broad audience in an academic context. We use these to demonstrate the feasibility of discussing complex content using more accessible language.
+
+Lastly, the mean reading level of the entered author's work is displayed as a boxplot that shares an x-axis with the ART corpus distribution data. The boxplot depicts the mean and the first and third quartiles of the author's specific works. It gives the viewer of the report a sense of the underlying variance in the specific author's work relative to the variance in the ART corpus. We also display single data points for the maximum and minimum scores. Thus, the resulting graph displays the mean writing complexity of the entered author against a distribution of ART corpus content as well as these other text repositories of known complexity.
+
+
+
+
+## Competition Output
+The three-author competition plot displays distributions showing the readability of only each author's written work, as scraped and analyzed from Google Scholar. Vertical lines are used to plot the mean readability value for each author. Anonymous authors A and B are co-authors who publish in the same field, so their readability scores should be closely matched, as their scores will be derived from some mutual documents. Anonymous author C publishes in an unrelated field and does not co-author with authors A and B.
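The summary statistics described for the boxplot and the competition plot can be sketched as follows. This is a minimal illustration using only the Python standard library; `summarize_scores`, `more_readable_author`, and the score lists are hypothetical and are not taken from this repository. It assumes that a lower score means more readable text.

```python
import statistics


def summarize_scores(scores):
    """Return the values the boxplot description mentions for one author."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(scores, n=4)
    return {
        "mean": statistics.mean(scores),
        "q1": q1,
        "q3": q3,
        "min": min(scores),
        "max": max(scores),
    }


def more_readable_author(scores_by_author):
    """Name of the author with the lowest mean score (assumed: lower is more readable)."""
    return min(scores_by_author, key=lambda name: statistics.mean(scores_by_author[name]))


# Hypothetical per-document readability scores for two authors.
scores = {
    "author_a": [14.0, 16.0, 18.0, 20.0],
    "author_b": [10.0, 12.0, 14.0, 16.0],
}

for name, values in scores.items():
    print(name, summarize_scores(values))
print("lowest mean score:", more_readable_author(scores))
```

Quartile conventions differ between libraries, so the quartiles printed here may not match those drawn by a given plotting package for the same data.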
+ + + +# References diff --git a/search_author_vis_data.ipynb b/search_author_vis_data.ipynb index aee618cd01c4e54d08c06612903fcec0685701b5..a6a9fd023cd8d5c79131d5400995192bc3d8d164 100644 --- a/search_author_vis_data.ipynb +++ b/search_author_vis_data.ipynb @@ -16,27 +16,77 @@ { "cell_type": "code", "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "total 20344\r\n", + "-rw-r--r--@ 1 rjjarvis staff 90 Apr 16 21:54 __init__.py\r\n", + "-rw-r--r--@ 1 rjjarvis staff 2146 Apr 16 21:54 analysis.py\r\n", + "-rw-r--r--@ 1 rjjarvis staff 5332 Apr 16 21:54 authors.py\r\n", + "-rw-r--r--@ 1 rjjarvis staff 1226 Apr 16 21:54 enter_author_name.py\r\n", + "-rw-r--r--@ 1 rjjarvis staff 14106 Apr 16 21:54 for_joss_competion.py\r\n", + "drwxr-xr-x@ 7 rjjarvis staff 224 Apr 16 22:12 \u001b[34mscholar_scrape\u001b[m\u001b[m\r\n", + "-rw-r--r--@ 1 rjjarvis staff 14158 Apr 17 15:34 utils.py\r\n", + "-rw-r--r-- 1 rjjarvis staff 4036 Apr 17 18:50 CONTRIBUTING.md\r\n", + "-rw-r--r--@ 1 rjjarvis staff 1236 Apr 17 18:50 license.md\r\n", + "-rw-r--r--@ 1 rjjarvis staff 6340 Apr 17 21:00 plotting_author_versus_distribution.py\r\n", + "-rw-r--r--@ 1 rjjarvis staff 183730 Apr 17 21:34 original_distri.ipynb\r\n", + "-rw-r--r--@ 1 rjjarvis staff 12500 Apr 17 21:34 original_distri.py\r\n", + "drwxr-xr-x 4 rjjarvis staff 128 Apr 26 11:57 \u001b[34mdata\u001b[m\u001b[m\r\n", + "-rw-r--r-- 1 rjjarvis staff 11344 Apr 26 11:59 README.md\r\n", + "-rw-r--r-- 1 rjjarvis staff 681 Apr 26 11:59 gecko_install.sh\r\n", + "-rw-r--r-- 1 rjjarvis staff 5073 Apr 26 11:59 get_bmark_corpus.py\r\n", + "-rw-r--r-- 1 rjjarvis staff 978 Apr 26 11:59 install.sh\r\n", + "-rw-r--r-- 1 rjjarvis staff 50150 Apr 26 11:59 scholar.py\r\n", + "-rw-r--r-- 1 rjjarvis staff 13314 Apr 26 11:59 scrape.py\r\n", + "-rw-r--r-- 1 rjjarvis staff 5041 Apr 26 11:59 
crawl.py\r\n", + "drwxr-xr-x 6 rjjarvis staff 192 Apr 26 12:00 \u001b[34mold\u001b[m\u001b[m\r\n", + "-rw-r--r-- 1 rjjarvis staff 6376 Apr 26 12:00 online_app_backend.py.orig\r\n", + "-rw-r--r-- 1 rjjarvis staff 6031 Apr 26 12:01 online_app_backend.py\r\n", + "-rw-r--r-- 1 rjjarvis staff 112 Apr 26 12:03 requirements.txt\r\n", + "-rw-r--r-- 1 rjjarvis staff 6921 Apr 26 14:20 paper.md\r\n", + "-rw-r--r-- 1 rjjarvis staff 0 Jun 18 17:32 more_authors_results.p?dl=0\r\n", + "-rw-r--r-- 1 rjjarvis staff 68232 Jun 18 17:32 benchmarks.p\r\n", + "-rw-r--r-- 1 rjjarvis staff 35381 Jun 18 17:49 geckodriver.log\r\n", + "-rw-r--r-- 1 rjjarvis staff 5165 Jun 18 17:49 _author_specific.p\r\n", + "-rw-r--r-- 1 rjjarvis staff 9822110 Jun 18 17:49 traingDats.p\r\n", + "drwxr-xr-x 10 rjjarvis staff 320 Jun 18 17:49 \u001b[34m__pycache__\u001b[m\u001b[m\r\n", + "-rw-r--r-- 1 rjjarvis staff 58107 Jun 18 17:51 search_author_vis_data.ipynb\r\n", + "-rw-r--r--@ 1 rjjarvis staff 9629 Jun 18 17:52 t_analysis.py\r\n" + ] + } + ], + "source": [] + }, + { + "cell_type": "code", + "execution_count": 3, "metadata": { "scrolled": true }, "outputs": [ { - "name": "stderr", + "name": "stdout", "output_type": "stream", "text": [ - "[nltk_data] Downloading package punkt to /home/user/nltk_data...\n", - "[nltk_data] Package punkt is already up-to-date!\n" + "[nltk_data] Downloading package punkt to /Users/rjjarvis/nltk_data...\n", + "[nltk_data] Package punkt is already up-to-date!\n", + "[nltk_data] Downloading package stopwords to\n", + "[nltk_data] /Users/rjjarvis/nltk_data...\n", + "[nltk_data] Package stopwords is already up-to-date!\n", + "mv: traingDats.p?dl=0: No such file or directory\n", + "mv: benchmarks.p?dl=0: No such file or directory\n" ] - }, - { - "data": { - "text/plain": [ - "True" - ] - }, - "execution_count": 1, - "metadata": {}, - "output_type": "execute_result" } ], "source": [ @@ -44,18 +94,60 @@ "import nltk\n", "nltk.download('punkt')\n", "import nltk\n", - 
"nltk.download('stopwords')" + "nltk.download('stopwords')\n", + "try:\n", + " !mv traingDats.p?dl=0 traingDats.p\n", + " !mv benchmarks.p?dl=0 benchmarks.p\n", + "except:\n", + " pass\n", + "\n" ] }, { "cell_type": "code", - "execution_count": 2, + "execution_count": 4, + "metadata": {}, + "outputs": [], + "source": [ + "\n", + "from online_app_backend import call_from_front_end\n", + "from online_app_backend import ar_manipulation\n", + "\n", + "import streamlit as st\n", + "import pandas as pd\n", + "#from sklearn import datasets\n", + "#from sklearn.ensemble import RandomForestClassifier\n", + "\n", + "st.write(\"\"\"\n", + " Please Enter the scholar Author you would like to search for\n", + "\"\"\")\n", + "\n", + "st.sidebar.header('User Input Parameters')\n", + "\n", + "def user_input_features():\n", + " sepal_length = st.sidebar.slider('Sepal length', 4.3, 7.9, 5.4)\n", + " sepal_width = st.sidebar.slider('Sepal width', 2.0, 4.4, 3.4)\n", + " petal_length = st.sidebar.slider('Petal length', 1.0, 6.9, 1.3)\n", + " petal_width = st.sidebar.slider('Petal width', 0.1, 2.5, 0.2)\n", + " data = {'sepal_length': sepal_length,\n", + " 'sepal_width': sepal_width,\n", + " 'petal_length': petal_length,\n", + " 'petal_width': petal_width}\n", + " features = pd.DataFrame(data, index=[0])\n", + " return features\n", + "\n", + "df = user_input_features()" + ] + }, + { + "cell_type": "code", + "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { - "model_id": "4f39b8d7fcec4f88919c2b40e18cf32b", + "model_id": "2c0e5f0e67e342bbac0d3bcb32931243", "version_major": 2, "version_minor": 0 }, @@ -74,29 +166,109 @@ ] }, { - "ename": "LookupError", - "evalue": "\n**********************************************************************\n Resource \u001b[93mstopwords\u001b[0m not found.\n Please use the NLTK Downloader to obtain the resource:\n\n \u001b[31m>>> import nltk\n >>> nltk.download('stopwords')\n \u001b[0m\n For more 
information see: https://www.nltk.org/data.html\n\n Attempted to load \u001b[93mcorpora/stopwords\u001b[0m\n\n Searched in:\n - '/home/user/nltk_data'\n - '/home/user/anaconda3/nltk_data'\n - '/home/user/anaconda3/share/nltk_data'\n - '/home/user/anaconda3/lib/nltk_data'\n - '/usr/share/nltk_data'\n - '/usr/local/share/nltk_data'\n - '/usr/lib/nltk_data'\n - '/usr/local/lib/nltk_data'\n**********************************************************************\n", + "name": "stderr", + "output_type": "stream", + "text": [ + "/anaconda3/lib/python3.6/site-packages/numpy/core/fromnumeric.py:2957: RuntimeWarning: Mean of empty slice.\n", + " out=out, **kwargs)\n", + "/anaconda3/lib/python3.6/site-packages/numpy/core/_methods.py:80: RuntimeWarning: invalid value encountered in double_scalars\n", + " ret = ret.dtype.type(ret / rcount)\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "nan\n", + "Empty DataFrame\n", + "Columns: []\n", + "Index: []\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/Users/rjjarvis/git/wcomplexity_dash/plotting_author_versus_distribution.py:6: UserWarning: \n", + "This call to matplotlib.use() has no effect because the backend has already\n", + "been chosen; matplotlib.use() must be called *before* pylab, matplotlib.pyplot,\n", + "or matplotlib.backends is imported for the first time.\n", + "\n", + "The backend was *originally* set to 'module://ipykernel.pylab.backend_inline' by the following code:\n", + " File \"/anaconda3/lib/python3.6/runpy.py\", line 193, in _run_module_as_main\n", + " \"__main__\", mod_spec)\n", + " File \"/anaconda3/lib/python3.6/runpy.py\", line 85, in _run_code\n", + " exec(code, run_globals)\n", + " File \"/anaconda3/lib/python3.6/site-packages/ipykernel_launcher.py\", line 16, in <module>\n", + " app.launch_new_instance()\n", + " File \"/anaconda3/lib/python3.6/site-packages/traitlets/config/application.py\", line 658, in launch_instance\n", + " app.start()\n", + " 
File \"/anaconda3/lib/python3.6/site-packages/ipykernel/kernelapp.py\", line 597, in start\n", + " self.io_loop.start()\n", + " File \"/anaconda3/lib/python3.6/site-packages/tornado/platform/asyncio.py\", line 127, in start\n", + " self.asyncio_loop.run_forever()\n", + " File \"/anaconda3/lib/python3.6/asyncio/base_events.py\", line 422, in run_forever\n", + " self._run_once()\n", + " File \"/anaconda3/lib/python3.6/asyncio/base_events.py\", line 1432, in _run_once\n", + " handle._run()\n", + " File \"/anaconda3/lib/python3.6/asyncio/events.py\", line 145, in _run\n", + " self._callback(*self._args)\n", + " File \"/anaconda3/lib/python3.6/site-packages/tornado/ioloop.py\", line 759, in _run_callback\n", + " ret = callback()\n", + " File \"/anaconda3/lib/python3.6/site-packages/tornado/stack_context.py\", line 276, in null_wrapper\n", + " return fn(*args, **kwargs)\n", + " File \"/anaconda3/lib/python3.6/site-packages/tornado/gen.py\", line 1199, in inner\n", + " self.run()\n", + " File \"/anaconda3/lib/python3.6/site-packages/tornado/gen.py\", line 1113, in run\n", + " yielded = self.gen.send(value)\n", + " File \"/anaconda3/lib/python3.6/site-packages/ipykernel/kernelbase.py\", line 365, in process_one\n", + " yield gen.maybe_future(dispatch(*args))\n", + " File \"/anaconda3/lib/python3.6/site-packages/tornado/gen.py\", line 315, in wrapper\n", + " yielded = next(result)\n", + " File \"/anaconda3/lib/python3.6/site-packages/ipykernel/kernelbase.py\", line 268, in dispatch_shell\n", + " yield gen.maybe_future(handler(stream, idents, msg))\n", + " File \"/anaconda3/lib/python3.6/site-packages/tornado/gen.py\", line 315, in wrapper\n", + " yielded = next(result)\n", + " File \"/anaconda3/lib/python3.6/site-packages/ipykernel/kernelbase.py\", line 545, in execute_request\n", + " user_expressions, allow_stdin,\n", + " File \"/anaconda3/lib/python3.6/site-packages/tornado/gen.py\", line 315, in wrapper\n", + " yielded = next(result)\n", + " File 
\"/anaconda3/lib/python3.6/site-packages/ipykernel/ipkernel.py\", line 300, in do_execute\n", + " res = shell.run_cell(code, store_history=store_history, silent=silent)\n", + " File \"/anaconda3/lib/python3.6/site-packages/ipykernel/zmqshell.py\", line 536, in run_cell\n", + " return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)\n", + " File \"/anaconda3/lib/python3.6/site-packages/IPython/core/interactiveshell.py\", line 2666, in run_cell\n", + " self.events.trigger('post_run_cell', result)\n", + " File \"/anaconda3/lib/python3.6/site-packages/IPython/core/events.py\", line 88, in trigger\n", + " func(*args, **kwargs)\n", + " File \"/anaconda3/lib/python3.6/site-packages/ipykernel/pylab/backend_inline.py\", line 168, in configure_once\n", + " activate_matplotlib(backend)\n", + " File \"/anaconda3/lib/python3.6/site-packages/IPython/core/pylabtools.py\", line 311, in activate_matplotlib\n", + " matplotlib.pyplot.switch_backend(backend)\n", + " File \"/anaconda3/lib/python3.6/site-packages/matplotlib/pyplot.py\", line 231, in switch_backend\n", + " matplotlib.use(newbackend, warn=False, force=True)\n", + " File \"/anaconda3/lib/python3.6/site-packages/matplotlib/__init__.py\", line 1410, in use\n", + " reload(sys.modules['matplotlib.backends'])\n", + " File \"/anaconda3/lib/python3.6/importlib/__init__.py\", line 166, in reload\n", + " _bootstrap._exec(spec, module)\n", + " File \"/anaconda3/lib/python3.6/site-packages/matplotlib/backends/__init__.py\", line 16, in <module>\n", + " line for line in traceback.format_stack()\n", + "\n", + "\n", + " mpl.use(\"Agg\")\n" + ] + }, + { + "ename": "FileNotFoundError", + "evalue": "[Errno 2] No such file or directory: 'more_authors_results.p'", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", - "\u001b[0;31mLookupError\u001b[0m Traceback (most recent call last)", - 
"\u001b[0;32m~/anaconda3/lib/python3.7/site-packages/nltk/corpus/util.py\u001b[0m in \u001b[0;36m__load\u001b[0;34m(self)\u001b[0m\n\u001b[1;32m 85\u001b[0m \u001b[0;32mtry\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 86\u001b[0;31m \u001b[0mroot\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mnltk\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdata\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfind\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'{}/{}'\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mformat\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msubdir\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mzip_name\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 87\u001b[0m \u001b[0;32mexcept\u001b[0m \u001b[0mLookupError\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;32m~/anaconda3/lib/python3.7/site-packages/nltk/data.py\u001b[0m in \u001b[0;36mfind\u001b[0;34m(resource_name, paths)\u001b[0m\n\u001b[1;32m 700\u001b[0m \u001b[0mresource_not_found\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m'\\n%s\\n%s\\n%s\\n'\u001b[0m \u001b[0;34m%\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0msep\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mmsg\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0msep\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 701\u001b[0;31m \u001b[0;32mraise\u001b[0m \u001b[0mLookupError\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mresource_not_found\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 702\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;31mLookupError\u001b[0m: \n**********************************************************************\n Resource \u001b[93mstopwords\u001b[0m not found.\n Please use the NLTK Downloader to obtain the resource:\n\n \u001b[31m>>> import nltk\n >>> 
nltk.download('stopwords')\n \u001b[0m\n For more information see: https://www.nltk.org/data.html\n\n Attempted to load \u001b[93mcorpora/stopwords.zip/stopwords/\u001b[0m\n\n Searched in:\n - '/home/user/nltk_data'\n - '/home/user/anaconda3/nltk_data'\n - '/home/user/anaconda3/share/nltk_data'\n - '/home/user/anaconda3/lib/nltk_data'\n - '/usr/share/nltk_data'\n - '/usr/local/share/nltk_data'\n - '/usr/lib/nltk_data'\n - '/usr/local/lib/nltk_data'\n**********************************************************************\n", - "\nDuring handling of the above exception, another exception occurred:\n", - "\u001b[0;31mLookupError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m<ipython-input-2-ce0a7dbd195b>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m 22\u001b[0m \u001b[0mdisplay\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0myear_input\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 23\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 24\u001b[0;31m \u001b[0mresults\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mcall_from_front_end\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0myear_output\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mvalue\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 25\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mresults\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;32m~/git/ScienceAccess/online_app_backend.py\u001b[0m in \u001b[0;36mcall_from_front_end\u001b[0;34m(NAME, tour, NAME1, verbose)\u001b[0m\n\u001b[1;32m 163\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mtype\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtour\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0mtype\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 164\u001b[0m 
\u001b[0mscholar_link\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mstr\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'https://scholar.google.com/scholar?hl=en&as_sdt=0%2C3&q='\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m+\u001b[0m\u001b[0mstr\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mNAME\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 165\u001b[0;31m \u001b[0mdf\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mdatay\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mar\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0menter_name_here\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mscholar_link\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0mNAME\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 166\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 167\u001b[0m \u001b[0;32mwith\u001b[0m \u001b[0mopen\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'_author_specific'\u001b[0m\u001b[0;34m+\u001b[0m\u001b[0mstr\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mNAME\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m+\u001b[0m\u001b[0;34m'.p'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m'wb'\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0mf\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mpickle\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdump\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mNAME\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0mar\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0mdf\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0mdatay\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0mscholar_link\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0mf\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;32m~/git/ScienceAccess/online_app_backend.py\u001b[0m in \u001b[0;36menter_name_here\u001b[0;34m(scholar_page, name)\u001b[0m\n\u001b[1;32m 129\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 130\u001b[0m \u001b[0;32mdef\u001b[0m 
\u001b[0menter_name_here\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mscholar_page\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mname\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 131\u001b[0;31m \u001b[0mdf\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mdatay\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mauthor_results\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mupdate_web_form\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mscholar_page\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 132\u001b[0m \u001b[0;31m#author_results\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 133\u001b[0m '''\n", - "\u001b[0;32m~/git/ScienceAccess/online_app_backend.py\u001b[0m in \u001b[0;36mupdate_web_form\u001b[0;34m(url)\u001b[0m\n\u001b[1;32m 114\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0murl\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 115\u001b[0m \u001b[0;31m#data = author_results = {}\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 116\u001b[0;31m \u001b[0mauthor_results\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtake_url_from_gui\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0murl\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 117\u001b[0m \u001b[0mar\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mcopy\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcopy\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mauthor_results\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 118\u001b[0m \u001b[0;31m#data[name] = author_results\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;32m~/git/ScienceAccess/online_app_backend.py\u001b[0m in 
\u001b[0;36mtake_url_from_gui\u001b[0;34m(author_link_scholar_link_list)\u001b[0m\n\u001b[1;32m 61\u001b[0m \u001b[0mfollow_more_links\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mcollect_pubs\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mr\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 62\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0mr\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mfollow_more_links\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 63\u001b[0;31m \u001b[0murlDat\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mprocess\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mr\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 64\u001b[0m \u001b[0;31m#print(urlDat)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 65\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;32m~/git/ScienceAccess/get_bmark_corpus.py\u001b[0m in \u001b[0;36mprocess\u001b[0;34m(link)\u001b[0m\n\u001b[1;32m 43\u001b[0m \u001b[0mpdf_file\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mrequests\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlink\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mstream\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mTrue\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 44\u001b[0m \u001b[0mbuffered\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mconvert_pdf_to_txt\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mpdf_file\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 45\u001b[0;31m \u001b[0murlDat\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtext_proc\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mbuffered\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0murlDat\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 46\u001b[0m \u001b[0;32mreturn\u001b[0m 
\u001b[0murlDat\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 47\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;32m~/git/ScienceAccess/t_analysis.py\u001b[0m in \u001b[0;36mtext_proc\u001b[0;34m(corpus, urlDat, WORD_LIM)\u001b[0m\n\u001b[1;32m 139\u001b[0m \u001b[0mtokens\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mword_tokenize\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mcorpus\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 140\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 141\u001b[0;31m \u001b[0mstop_words\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mstopwords\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mwords\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'english'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 142\u001b[0m \u001b[0;31m#We create a list comprehension which only returns a list of words #that are NOT IN stop_words and NOT IN punctuations.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 143\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;32m~/anaconda3/lib/python3.7/site-packages/nltk/corpus/util.py\u001b[0m in \u001b[0;36m__getattr__\u001b[0;34m(self, attr)\u001b[0m\n\u001b[1;32m 121\u001b[0m \u001b[0;32mraise\u001b[0m \u001b[0mAttributeError\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"LazyCorpusLoader object has no attribute '__bases__'\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 122\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 123\u001b[0;31m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m__load\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 124\u001b[0m \u001b[0;31m# This looks circular, but its not, since __load() changes 
our\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 125\u001b[0m \u001b[0;31m# __class__ to something new:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;32m~/anaconda3/lib/python3.7/site-packages/nltk/corpus/util.py\u001b[0m in \u001b[0;36m__load\u001b[0;34m(self)\u001b[0m\n\u001b[1;32m 86\u001b[0m \u001b[0mroot\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mnltk\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdata\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfind\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'{}/{}'\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mformat\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msubdir\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mzip_name\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 87\u001b[0m \u001b[0;32mexcept\u001b[0m \u001b[0mLookupError\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 88\u001b[0;31m \u001b[0;32mraise\u001b[0m \u001b[0me\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 89\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 90\u001b[0m \u001b[0;31m# Load the corpus.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;32m~/anaconda3/lib/python3.7/site-packages/nltk/corpus/util.py\u001b[0m in \u001b[0;36m__load\u001b[0;34m(self)\u001b[0m\n\u001b[1;32m 81\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 82\u001b[0m \u001b[0;32mtry\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 83\u001b[0;31m \u001b[0mroot\u001b[0m \u001b[0;34m=\u001b[0m 
\u001b[0mnltk\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdata\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfind\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'{}/{}'\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mformat\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msubdir\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m__name\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 84\u001b[0m \u001b[0;32mexcept\u001b[0m \u001b[0mLookupError\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0me\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 85\u001b[0m \u001b[0;32mtry\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;32m~/anaconda3/lib/python3.7/site-packages/nltk/data.py\u001b[0m in \u001b[0;36mfind\u001b[0;34m(resource_name, paths)\u001b[0m\n\u001b[1;32m 699\u001b[0m \u001b[0msep\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m'*'\u001b[0m \u001b[0;34m*\u001b[0m \u001b[0;36m70\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 700\u001b[0m \u001b[0mresource_not_found\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m'\\n%s\\n%s\\n%s\\n'\u001b[0m \u001b[0;34m%\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0msep\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mmsg\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0msep\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 701\u001b[0;31m \u001b[0;32mraise\u001b[0m \u001b[0mLookupError\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mresource_not_found\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 702\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 703\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;31mLookupError\u001b[0m: \n**********************************************************************\n Resource 
\u001b[93mstopwords\u001b[0m not found.\n Please use the NLTK Downloader to obtain the resource:\n\n \u001b[31m>>> import nltk\n >>> nltk.download('stopwords')\n \u001b[0m\n For more information see: https://www.nltk.org/data.html\n\n Attempted to load \u001b[93mcorpora/stopwords\u001b[0m\n\n Searched in:\n - '/home/user/nltk_data'\n - '/home/user/anaconda3/nltk_data'\n - '/home/user/anaconda3/share/nltk_data'\n - '/home/user/anaconda3/lib/nltk_data'\n - '/usr/share/nltk_data'\n - '/usr/local/share/nltk_data'\n - '/usr/lib/nltk_data'\n - '/usr/local/lib/nltk_data'\n**********************************************************************\n" + "\u001b[0;31mFileNotFoundError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m<ipython-input-5-ce0a7dbd195b>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 22\u001b[0m \u001b[0mdisplay\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0myear_input\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 23\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 24\u001b[0;31m \u001b[0mresults\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mcall_from_front_end\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0myear_output\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mvalue\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 25\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mresults\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m~/git/wcomplexity_dash/online_app_backend.py\u001b[0m in \u001b[0;36mcall_from_front_end\u001b[0;34m(NAME, tour, NAME1, verbose)\u001b[0m\n\u001b[1;32m 172\u001b[0m \u001b[0;32mwith\u001b[0m \u001b[0mopen\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'traingDats.p'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m'wb'\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0mf\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 173\u001b[0m 
\u001b[0mpickle\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdump\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtrainingDats\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0mf\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 174\u001b[0;31m \u001b[0;32mimport\u001b[0m \u001b[0mplotting_author_versus_distribution\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 175\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mar\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 176\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m~/git/wcomplexity_dash/plotting_author_versus_distribution.py\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 15\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 16\u001b[0m \u001b[0mbmark\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mpickle\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mload\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mopen\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'benchmarks.p'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m'rb'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 17\u001b[0;31m \u001b[0mNAME\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0mar\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mpickle\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mload\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mopen\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'more_authors_results.p'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m'rb'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 18\u001b[0m \u001b[0;31m#print(ar)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 19\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mbmark\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;31mFileNotFoundError\u001b[0m: [Errno 2] No such file or directory: 'more_authors_results.p'" ] } ], @@ -187,7 +359,7 @@ "group_labels = ['Biochemistry Documents']#, 'Group 2', 'Group 3']\n", "colors = 
['#393E46']#, '#2BCDC1', '#F66095']\n",
 "\n",
- "fig = ff.create_distplot([standard_sci], group_labelad aas, colors=colors,\n",
+ "fig = ff.create_distplot([standard_sci], group_labels, colors=colors,\n",
 " bin_size=[0.3, 0.2, 0.1], show_curve=True)\n",
 "\n",
 "fig.update(layout_title_text='Art Corpus')\n",
@@ -827,7 +999,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
- "version": "3.7.4"
+ "version": "3.6.5"
 }
 },
 "nbformat": 4,