ScienceAccess issues
https://git.qoto.org/russelljjarvis/ScienceAccess/-/issues
2020-08-27T04:50:17Z

use one-sample t-test instead of default t-test
https://git.qoto.org/russelljjarvis/ScienceAccess/-/issues/10
2020-08-27T04:50:17Z, Russell Jarvis
*Created by: russelljjarvis*
https://www.hackdeploy.com/python-t-test-a-friendly-guide/
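The title asks for a one-sample test, whereas the `ttest_ind` call quoted from the guide in this issue compares two independent samples. A minimal sketch of the one-sample variant with `scipy.stats.ttest_1samp` follows; the `scores` values and `reference_mean` are made-up placeholders, not data from this project.

```python
from scipy import stats

# Hypothetical readability scores for one author (placeholder values).
scores = [12.1, 14.3, 13.8, 15.0, 12.9, 14.6]
# Hypothetical population mean to test against (an assumption).
reference_mean = 10.0

# One-sample t-test: does the sample mean differ from reference_mean?
t_stat, p_value = stats.ttest_1samp(scores, popmean=reference_mean)
print("P-Value:{0} T-Statistic:{1}".format(p_value, t_stat))
```

The one-sample form fits when a single set of scores is compared against a fixed reference value rather than against a second sample.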
```python
from scipy import stats
# a and b are the two samples to compare, defined elsewhere
tStat, pValue = stats.ttest_ind(a, b, equal_var=False)  # independent-samples (Welch) t-test
print("P-Value:{0} T-Statistic:{1}".format(pValue, tStat))  # print the p-value and the t-statistic
```

replace video in readme with Bradley's higher resolution GIF.
https://git.qoto.org/russelljjarvis/ScienceAccess/-/issues/9
2020-08-27T04:48:09Z, Russell Jarvis
*Created by: russelljjarvis*
replace video in readme with Bradley's higher resolution GIF.

Scraping Scholar not sustainable and alternatives exist
https://git.qoto.org/russelljjarvis/ScienceAccess/-/issues/8
2020-08-22T01:26:09Z, Russell Jarvis
*Created by: russelljjarvis*
Alternatives:
Crossref
https://libgen.is/
https://github.com/UWNETLAB/metaknowledge
https://github.com/pybliometrics-dev/pybliometrics
@mcgurrgurr

speed up second word cloud using spaCy to filter English.
https://git.qoto.org/russelljjarvis/ScienceAccess/-/issues/6
2020-08-18T05:09:55Z, Russell Jarvis
*Created by: russelljjarvis*
speed up second word cloud using spaCy to filter English.

ask developers of NLP app if they want to merge code?
https://git.qoto.org/russelljjarvis/ScienceAccess/-/issues/5
2020-08-18T05:00:16Z, Russell Jarvis
*Created by: russelljjarvis*
https://github.com/Jcharis/Streamlit_DataScience_Apps/blob/master/NLP_App_with_Streamlit_Python/app.py

readability of README issues.
https://git.qoto.org/russelljjarvis/ScienceAccess/-/issues/4
2020-08-26T17:18:57Z, Russell Jarvis
*Created by: russelljjarvis*
https://github.com/russelljjarvis/ScienceAccess/commit/6d3b478ffd4602d04514f0e64c1993a9b58ddd49
> Understanding a big word is hard, so when big ideas are written down with lots of big words, the large pile of big words is also hard to understand. We used a computer to quickly visit and read many different websites to see how hard each piece of writing was to understand. People may avoid learning hard ideas, only because there are too many hard words. We think we can help by explaining the problem with smaller words, and by creating tools to address the problem.
The original version of that paragraph was written with the Up-Goer Five text editor, using only 10,000 of the most common English words. It might sound a bit condescending, but we have to write something this basic for internal consistency.
"Data-driven" is an important term for a manuscript, but it is a very scientific, big word.
Also, I believe we need to change "[constraints on conceptual references](https://github.com/russelljjarvis/ScienceAccess/blob/master/main.py#L146)" to just grammatically correct nonsense.
@mcgurrgurr

over scraping scholar, or filter strings being misapplied? Which is it and in any case how to handle exceptions.
https://git.qoto.org/russelljjarvis/ScienceAccess/-/issues/3
2020-08-19T03:15:18Z, Russell Jarvis
*Created by: russelljjarvis*
It seems that queries now return no results.
It generates an error, but I think the error occurs because no results came through (an attempt to convert a NaN).
![image](https://user-images.githubusercontent.com/7786645/90455471-7e812600-e139-11ea-8173-ad6127928e2d.png)
> It seems that queries now return no results.
But did scraping still take time? If so, the scraper may be functioning correctly while the words are being thrown out in post-processing.
I suggest putting the lines:
```python
import streamlit as st
st.text(corpus)
```
at line https://github.com/russelljjarvis/ScienceAccess/blob/master/science_access/t_analysis.py#L150 to see what words are going into t_analysis.
Do it again at
https://github.com/russelljjarvis/ScienceAccess/blob/master/science_access/t_analysis.py#L217
```python
import streamlit as st
st.text(tokens)
```
This shows which words were used after filtering in the analysis. You might find that one of these variables is empty.
If ***privacy policy*** appears in the keywords, then you have over-scraped. In that case, you can gracefully fall back to the more robust Open Access paper search. If privacy policy is in the first result, it will be in all the results. Instead of waiting 4-5 minutes for the scraper to return 15 pages of privacy policy, you can find out straight away by checking whether ```len(ar)==0``` and whether 'privacy-policy' is in tokens. If so, the code logic should fall back at the first reading of privacy policy rather than after a 5-minute wait.
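The early fallback described above can be sketched as follows. This is only an illustration under stated assumptions: `scrape_with_early_exit`, its `pages` and `tokenize` arguments, and the exact marker string are hypothetical names, not functions from the ScienceAccess code base.

```python
def scrape_with_early_exit(pages, tokenize, marker="privacy-policy"):
    """Tokenize pages one by one; stop at the first page containing `marker`.

    Returns (ar, fall_back): the collected token lists and whether the caller
    should switch to the Open Access search.
    """
    ar = []
    for page in pages:
        tokens = tokenize(page)
        if marker in tokens:
            # First blocked page: every later page would look the same.
            return [], True
        ar.append(tokens)
    # An empty result also counts as a failed scrape.
    return ar, len(ar) == 0


# Stand-in data: the first "page" is a privacy-policy notice.
pages = ["privacy-policy cookies consent", "never reached"]
ar, fall_back = scrape_with_early_exit(pages, str.split)
print(ar, fall_back)
```

Checking each page as it arrives means the remaining pages are never fetched once the marker shows up, which is where the 4-5 minute wait would otherwise go.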
You need to modify a function so that it can test whether Scholar scraping worked and, if it did not, set the variable:
```python
st.text("scholar scrape failed falling back to Open Access search")
OPEN_ACCESS=True
```
This variable now needs to be changed to lower case:
https://github.com/russelljjarvis/ScienceAccess/blob/master/science_access/online_app_backend.py#L176
```python
import requests
import streamlit as st
from crossref_commons.iteration import iterate_publications_as_json

# NAME, process() and ar_manipulation() come from the surrounding module;
# openaccess is assumed to be initialised earlier in the same function.
(ar, trainingDats) = ar_manipulation(ar)
if len(ar) == 0:
    openaccess = True
    st.text("scholar scrape failed falling back to Open Access search")
if openaccess:
    # filter_ = {'type': 'journal-article'}
    queries = {'query.author': NAME}
    ar = []
    bi = [p for p in iterate_publications_as_json(max_results=100, queries=queries)]
    for p in bi[0:9]:
        res = 'https://api.unpaywall.org/v2/' + str(p['DOI']) + '?email=YOUR_EMAIL'
        response = requests.get(res).json()
        # Check for None before indexing into the response.
        if response is not None and response['is_oa']:
            st.text(response)
            print(response.keys())
            try:
                temp = response['best_oa_location']['url_for_pdf']
            except (KeyError, TypeError):
                # No PDF link available: use the landing-page URL instead.
                temp = response['best_oa_location']['url']
            st.text(temp)
            if temp is not None:
                urlDat = process(temp)
                if urlDat is not None:
                    ar.append(urlDat)
    (ar, trainingDats) = ar_manipulation(ar)
```
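One case the try/except in the snippet does not cover is a response where `url_for_pdf` is present but empty, or where `best_oa_location` itself is missing. A small helper can handle both; `pick_oa_url` is a hypothetical name, and the field names are the ones used in the snippet in this issue.

```python
def pick_oa_url(response):
    """Return the PDF URL if present, else the landing-page URL, else None."""
    if not response or not response.get("is_oa"):
        return None
    # best_oa_location may be absent or None; treat both as "no location".
    location = response.get("best_oa_location") or {}
    return location.get("url_for_pdf") or location.get("url")


# Stand-in response dict (not real API output): PDF link empty, landing page set.
sample = {
    "is_oa": True,
    "best_oa_location": {"url_for_pdf": None, "url": "https://example.org/paper"},
}
print(pick_oa_url(sample))
```

Returning `None` for every unusable case keeps the caller down to a single `if url is not None` check before calling `process`.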
@mcgurrgurr

Some ideas for improvements
https://git.qoto.org/russelljjarvis/ScienceAccess/-/issues/2
2020-08-18T05:02:01Z, Russell Jarvis
*Created by: MarcSkovMadsen*
I like your README. It is very usable. A few things could improve it.
- Please state the license and add a LICENSE file.
- I might develop a similar version in Panel one day, because I like your app and I like Panel :-) But without a license specification, I don't know whether you would want or allow that.
- Please comment on `docker run` that the user should not set cpus higher than what is available on their machine. I only have 2 cpus, so I got an error message using the specified 4 cpus.
- Consider moving the .gif video a little higher, and maybe even changing it to an embedded .mp4 video where you share your vision and the resulting app. A picture says more than a thousand words, and a video might say even more :-)
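For the `--cpus` point above, one possible approach is to clamp the requested value to what the host reports before building the `docker run` flag. This is a sketch only; `docker_cpus` is a hypothetical helper, and the default of 4 mirrors the value mentioned in this issue.

```python
import os


def docker_cpus(requested=4.0):
    """Return the requested CPU count, capped at what this machine has."""
    # os.cpu_count() can return None when the count cannot be determined.
    available = os.cpu_count() or 1
    return min(float(requested), float(available))


print("--cpus={0}".format(docker_cpus(4.0)))
```

On a 2-CPU machine this would produce a flag that Docker accepts, instead of the error described above.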

How to get Docker container working
https://git.qoto.org/russelljjarvis/ScienceAccess/-/issues/1
2021-06-04T03:55:19Z, Russell Jarvis
*Created by: MarcSkovMadsen*
Hi @russelljjarvis
Based on your mail, I tried to follow the instructions on the GitHub front page: I ran git clone, docker build and docker run. However, I could not access the application at localhost:8080.
To get it working, I needed to change a few things:
#### Dockerfile
```dockerfile
RUN bash -c 'echo -e "\
[server]\n\
enableCORS = false\n\
enableXsrfProtection = false\n\
\n\
[browser]\n\
serverAddress = \"0.0.0.0\"\
" > /root/.streamlit/config.toml'
```
#### Command line
```bash
docker run -p 8080:8080 --shm-size=3gb --cpus=2.0 --memory=1g --memory-swap=1g --rm wcomplexity
```
You can see the changes below including the app running on localhost:8080
![image](https://user-images.githubusercontent.com/42288570/90371513-5fb65d00-e06f-11ea-9f29-3107e44ec4f8.png)