Ecosyste.ms: Issues
An open API service providing issue and pull request metadata for open source projects.
GitHub / fhamborg/news-please issues and pull requests
#180 - Add language filter for commoncrawler
Pull Request -
State: closed - Opened by AlviseSembenico about 4 years ago
- 1 comment
#179 - Heuristic url
Pull Request -
State: closed - Opened by AlviseSembenico about 4 years ago
- 1 comment
#178 - article.date_modify returns 'None' despite the article having a modified date
Issue -
State: closed - Opened by Anacoder1 about 4 years ago
- 3 comments
#177 - DateFilter not working when using CLI
Issue -
State: closed - Opened by benjamin-kraatz about 4 years ago
- 3 comments
#176 - DateFilters are not respected from config.cfg file
Issue -
State: closed - Opened by basingh over 4 years ago
#175 - Finished crawling with no results
Issue -
State: open - Opened by tobiasstrauss over 4 years ago
- 13 comments
#174 - RecursiveCrawler : ValueError('Missing scheme in request url: %s' % self._url)
Issue -
State: closed - Opened by basingh over 4 years ago
- 4 comments
#173 - fixes #172 and #169: NewsPlease.from_urls() - use multiprocessing
Pull Request -
State: closed - Opened by arcolife over 4 years ago
- 17 comments
#172 - NewsPlease.from_urls() could use multiprocessing
Issue -
State: closed - Opened by arcolife over 4 years ago
#171 - fixes #170: custom headers for requests
Pull Request -
State: closed - Opened by arcolife over 4 years ago
#170 - customized HEADERS are sometimes problematic
Issue -
State: closed - Opened by arcolife over 4 years ago
#169 - handle ArticleExtractor error on empty html returns
Issue -
State: closed - Opened by arcolife over 4 years ago
- 1 comment
#168 - how to download warc files between specific dates
Issue -
State: closed - Opened by Prateek-Tyagi over 4 years ago
- 1 comment
#167 - Verbose logging of exceptions if continue_after_error
Pull Request -
State: closed - Opened by sebastian-nagel over 4 years ago
- 1 comment
#166 - error in executing commoncrawl.py
Issue -
State: closed - Opened by Prateek-Tyagi over 4 years ago
- 16 comments
#165 - Amjltc295/use str html instead of bytes html to speed up
Pull Request -
State: closed - Opened by amjltc295 over 4 years ago
- 3 comments
#164 - Using bytes HTML is significantly slower than str HTML for parsing content
Issue -
State: closed - Opened by amjltc295 over 4 years ago
- 3 comments
#163 - [Error Crawling March data][[newsplease.crawler.commoncrawl_extractor] ERROR: Unexpected error: <class 'PermissionError'>]
Issue -
State: closed - Opened by sara-02 over 4 years ago
- 5 comments
#162 - Crawl Specific RSS Feeds on NYTimes
Issue -
State: closed - Opened by ericagredo over 4 years ago
- 6 comments
#161 - problem with 'mailto' links
Issue -
State: closed - Opened by nicolabertoldi over 4 years ago
- 10 comments
#160 - #159 Add missing hurry.filesize to requirements.txt
Pull Request -
State: closed - Opened by petlack over 4 years ago
- 1 comment
#159 - ModuleNotFoundError: No module named 'hurry'
Issue -
State: closed - Opened by petlack over 4 years ago
#158 - my_delete_warc_after_extraction
Pull Request -
State: closed - Opened by tbrknt over 4 years ago
- 1 comment
#157 - Less OS Dependency
Pull Request -
State: closed - Opened by tbrknt over 4 years ago
#156 - Add exitcode check for Subprocesses
Pull Request -
State: closed - Opened by tbrknt over 4 years ago
- 1 comment
#155 - Windows Compatability and Subprocess Check
Pull Request -
State: closed - Opened by tbrknt over 4 years ago
- 1 comment
#154 - rename text to maintext consistently
Issue -
State: open - Opened by fhamborg over 4 years ago
Labels: help wanted
#153 - language filtering
Issue -
State: closed - Opened by lalimili6 over 4 years ago
- 1 comment
#152 - Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER
Issue -
State: closed - Opened by lalimili6 over 4 years ago
- 1 comment
#151 - Javascript is disabled on your browser
Issue -
State: closed - Opened by lalimili6 over 4 years ago
- 1 comment
#150 - Javascript is disabled on your browser AND Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.
Issue -
State: closed - Opened by lalimili6 over 4 years ago
- 1 comment
#149 - crawl comments
Issue -
State: closed - Opened by lalimili6 over 4 years ago
- 1 comment
#148 - Fixed broken date extraction due to beautiful soup's tag.text.
Pull Request -
State: closed - Opened by thihara over 4 years ago
- 3 comments
#147 - Add library interface to scrape multiple articles from domain url
Pull Request -
State: closed - Opened by mrknight21 over 4 years ago
- 4 comments
#146 - add date filter for commoncrawl warc files
Pull Request -
State: closed - Opened by moyid over 4 years ago
- 4 comments
#145 - Changes to make Scrapy Item class customizable via configuration
Pull Request -
State: closed - Opened by thihara over 4 years ago
- 11 comments
#144 - Stopped the LOG_ENABLED variable from being unset
Pull Request -
State: closed - Opened by thihara over 4 years ago
- 2 comments
#142 - CommonCrawl: Start and End Date Not Working
Issue -
State: closed - Opened by ozgurakyazi over 4 years ago
- 10 comments
#141 - psycopg2 issue on macOS
Issue -
State: closed - Opened by hellc over 4 years ago
- 5 comments
#133 - article.text returns None on english article
Issue -
State: closed - Opened by ysig about 5 years ago
- 8 comments
#130 - ElasticsearchStorage can't save scraped files
Issue -
State: closed - Opened by JeromeGill about 5 years ago
- 6 comments
Labels: help wanted
#129 - RssCrawler doesn't support valid Rss XML
Issue -
State: closed - Opened by JeromeGill about 5 years ago
- 2 comments
Labels: help wanted
#101 - filter articles for keywords
Issue -
State: open - Opened by fhamborg over 5 years ago
- 7 comments
Labels: help wanted
#94 - Can I crawl a root site?
Issue -
State: closed - Opened by truenodeverano over 5 years ago
- 6 comments
#88 - Issue #54
Issue -
State: closed - Opened by aamin3 over 5 years ago
- 3 comments
#61 - To str convertion of the datetime fields
Issue -
State: closed - Opened by anastasia-zhukova over 6 years ago
- 5 comments
#12 - get order on main page
Issue -
State: closed - Opened by fhamborg almost 8 years ago
Labels: help wanted
#8 - Merge articles spread on multiple pages
Issue -
State: closed - Opened by fhamborg almost 8 years ago
- 2 comments
Labels: help wanted