Ecosyste.ms: Issues

An open API service that provides issue and pull request metadata for open source projects.

GitHub / alan-turing-institute/misinformation-crawler issues and pull requests

#362 - Tab delimited article input files

Issue - State: open - Opened by dongpng over 4 years ago

#361 - Allow crawler to run against specific URLs

Pull Request - State: closed - Opened by jemrobinson almost 5 years ago

#360 - Azure dependencies of the crawler

Issue - State: open - Opened by dongpng almost 5 years ago

#359 - Blockblobservice Error

Issue - State: open - Opened by dongpng almost 5 years ago

#358 - Command line tool for crawling specific URLs

Issue - State: closed - Opened by dongpng almost 5 years ago

#357 - README: Dependencies for installation

Issue - State: open - Opened by dongpng almost 5 years ago - 1 comment

#356 - Switch washingtontimes to sitemap crawl and add extra date format

Pull Request - State: closed - Opened by edwardchalstrey1 over 5 years ago - 1 comment

#355 - only update cookies when present

Pull Request - State: closed - Opened by edwardchalstrey1 over 5 years ago - 1 comment

#354 - Fix centerforsecuritypolicy

Pull Request - State: closed - Opened by jemrobinson over 5 years ago

#353 - ReadabiliPy breaking on centerforsecuritypolicy.org

Issue - State: closed - Opened by jemrobinson over 5 years ago
Labels: bug

#352 - more articles from apnews with sitemap

Pull Request - State: closed - Opened by edwardchalstrey1 over 5 years ago - 2 comments

#351 - continue crawling index pages conservativepapers

Pull Request - State: closed - Opened by edwardchalstrey1 over 5 years ago

#350 - add extra urls weeklyworldnews

Pull Request - State: closed - Opened by edwardchalstrey1 over 5 years ago

#349 - add extra urls clickhole

Pull Request - State: closed - Opened by edwardchalstrey1 over 5 years ago

#348 - add us news and world news categories realnewsrightnow

Pull Request - State: closed - Opened by edwardchalstrey1 over 5 years ago - 1 comment

#347 - Check sites with low article count

Issue - State: closed - Opened by edwardchalstrey1 over 5 years ago

#346 - Reconsider button clicking strategy

Issue - State: closed - Opened by jemrobinson over 5 years ago
Labels: enhancement

#345 - Missing sites from database

Issue - State: closed - Opened by edwardchalstrey1 over 5 years ago - 2 comments

#344 - add paper paragraphs

Pull Request - State: open - Opened by edwardchalstrey1 over 5 years ago

#343 - Update dailykos match rules

Pull Request - State: closed - Opened by edwardchalstrey1 over 5 years ago

#342 - Update date extraction for redstate.com

Pull Request - State: closed - Opened by edwardchalstrey1 over 5 years ago - 1 comment

#341 - Missing dates for some redstate.com articles

Issue - State: closed - Opened by edwardchalstrey1 over 5 years ago - 1 comment
Labels: bug, config

#340 - update byline xpath nationalreview

Pull Request - State: closed - Opened by edwardchalstrey1 over 5 years ago

#339 - Missing byline for some nationalreview.com articles

Issue - State: closed - Opened by edwardchalstrey1 over 5 years ago
Labels: config

#338 - madpatriots.com appears to have disappeared

Issue - State: open - Opened by edwardchalstrey1 over 5 years ago
Labels: problematic-site

#337 - eyeopening.info appears to no longer exist

Issue - State: open - Opened by edwardchalstrey1 over 5 years ago
Labels: problematic-site

#336 - update byline xpath denverpost

Pull Request - State: closed - Opened by edwardchalstrey1 over 5 years ago

#335 - Denverpost has some articles with bylines missing

Issue - State: closed - Opened by edwardchalstrey1 over 5 years ago
Labels: config

#334 - Missing metadata

Issue - State: closed - Opened by edwardchalstrey1 over 5 years ago

#333 - Politico bylines sub-optimal

Issue - State: closed - Opened by edwardchalstrey1 over 5 years ago

#332 - Fix vox article format

Pull Request - State: closed - Opened by jemrobinson over 5 years ago

#331 - vox.com extraction issues

Issue - State: closed - Opened by jemrobinson over 5 years ago
Labels: bug, config

#330 - Fix vanityfair.com

Pull Request - State: closed - Opened by jemrobinson over 5 years ago

#329 - vanityfair.com extraction issues

Issue - State: closed - Opened by jemrobinson over 5 years ago
Labels: bug, config

#328 - Updated ReadabiliPy and added test for breaking page

Pull Request - State: closed - Opened by jemrobinson over 5 years ago

#327 - ReadabiliPy crash on centerforsecuritypolicy.org

Issue - State: closed - Opened by jemrobinson over 5 years ago

#326 - dailykos extraction issues

Issue - State: closed - Opened by jemrobinson over 5 years ago
Labels: config

#325 - Crash when interpreting article from breitbart

Issue - State: closed - Opened by jemrobinson over 5 years ago - 1 comment
Labels: bug

#324 - Switch to updated version of ReadabiliPy with fixed Breitbart issue

Pull Request - State: closed - Opened by jemrobinson over 5 years ago

#323 - Fix button pressing

Pull Request - State: closed - Opened by jemrobinson over 5 years ago

#322 - denverpost.com using unnecessary button

Issue - State: closed - Opened by jemrobinson over 5 years ago
Labels: bug

#321 - Increase output verbosity

Pull Request - State: closed - Opened by jemrobinson over 5 years ago

#320 - Fix button pressing for time.com

Pull Request - State: closed - Opened by jemrobinson over 5 years ago

#319 - Button pressing broken on time.com

Issue - State: closed - Opened by jemrobinson over 5 years ago
Labels: bug

#318 - Get all article dates politico

Pull Request - State: closed - Opened by edwardchalstrey1 over 5 years ago

#317 - Politico has some articles with missing dates

Issue - State: closed - Opened by edwardchalstrey1 over 5 years ago
Labels: config

#316 - Fix missing bylines christianpost

Pull Request - State: closed - Opened by edwardchalstrey1 over 5 years ago

#315 - Some Christianpost articles missing byline

Issue - State: closed - Opened by edwardchalstrey1 over 5 years ago
Labels: config

#314 - Fix dailycaller.com

Pull Request - State: closed - Opened by edwardchalstrey1 over 5 years ago - 2 comments

#313 - Daily caller config needs updating

Issue - State: closed - Opened by edwardchalstrey1 over 5 years ago
Labels: config

#312 - Fix Fox News bylines and crawl

Pull Request - State: closed - Opened by edwardchalstrey1 over 5 years ago

#311 - Fox news has many bylines missing

Issue - State: closed - Opened by edwardchalstrey1 over 5 years ago
Labels: config

#310 - Switch infowars to index crawl

Pull Request - State: closed - Opened by edwardchalstrey1 over 5 years ago

#309 - Infowars.com missing many dates and bylines

Issue - State: closed - Opened by edwardchalstrey1 over 5 years ago
Labels: config

#308 - Update motherjones byline and add support for additional article type

Pull Request - State: closed - Opened by edwardchalstrey1 over 5 years ago - 4 comments

#307 - Motherjones not getting bylines

Issue - State: closed - Opened by edwardchalstrey1 over 5 years ago
Labels: config

#306 - Remove gallery pages CNN

Pull Request - State: closed - Opened by edwardchalstrey1 over 5 years ago - 2 comments

#305 - CNN has some articles with no byline or content

Issue - State: closed - Opened by edwardchalstrey1 over 5 years ago
Labels: config

#304 - Treat pages where no article is extracted as blank

Pull Request - State: closed - Opened by jemrobinson over 5 years ago

#303 - Fix newsweek

Pull Request - State: closed - Opened by edwardchalstrey1 over 5 years ago

#302 - Crawler not getting article content and bylines for many Newsweek articles

Issue - State: closed - Opened by edwardchalstrey1 over 5 years ago
Labels: config

#301 - Update nbcnews site config and tests

Pull Request - State: closed - Opened by edwardchalstrey1 over 5 years ago - 2 comments

#300 - NBC News needs to have config updated

Issue - State: closed - Opened by edwardchalstrey1 over 5 years ago
Labels: config

#299 - Add even more date formats abcnews

Pull Request - State: closed - Opened by edwardchalstrey1 over 5 years ago - 1 comment

#298 - Use many shallow crawlers for sites with article limits

Pull Request - State: closed - Opened by jemrobinson over 5 years ago

#297 - Fix mic.com

Pull Request - State: closed - Opened by edwardchalstrey1 over 5 years ago - 4 comments

#295 - addictinginfo.com has some dates missing

Issue - State: closed - Opened by edwardchalstrey1 over 5 years ago
Labels: bug

#294 - mic.com giving lots of warning messages

Issue - State: closed - Opened by jemrobinson over 5 years ago
Labels: problematic-site

#293 - Maximise correct crawl data abcnews

Pull Request - State: closed - Opened by edwardchalstrey1 over 5 years ago

#292 - abcnews has incorrect data/metadata for some articles

Issue - State: closed - Opened by edwardchalstrey1 over 5 years ago - 1 comment
Labels: bug

#291 - Better handling of URL overrides

Pull Request - State: closed - Opened by jemrobinson over 5 years ago

#290 - Allow pages to be force-crawled

Pull Request - State: closed - Opened by jemrobinson over 5 years ago

#284 - add tampabay

Pull Request - State: closed - Opened by edwardchalstrey1 over 5 years ago - 5 comments

#273 - add thedailybeast

Pull Request - State: closed - Opened by edwardchalstrey1 over 5 years ago - 2 comments

#266 - add columbiamissourian

Pull Request - State: closed - Opened by edwardchalstrey1 over 5 years ago - 1 comment
Labels: problematic-site

#265 - Cannot crawl columbiamissourian.com

Issue - State: open - Opened by edwardchalstrey1 over 5 years ago
Labels: problematic-site

#262 - Cannot crawl hudsonstarobserver.com

Issue - State: open - Opened by edwardchalstrey1 over 5 years ago
Labels: problematic-site

#261 - jihadwatch.org can only crawl a few articles

Issue - State: open - Opened by edwardchalstrey1 over 5 years ago
Labels: problematic-site

#259 - add pressherald

Pull Request - State: closed - Opened by edwardchalstrey1 over 5 years ago

#257 - apnews.com needs to be crawled regularly

Issue - State: closed - Opened by edwardchalstrey1 over 5 years ago - 1 comment
Labels: problematic-site

#256 - add kansascity

Pull Request - State: closed - Opened by edwardchalstrey1 over 5 years ago

#253 - add npr

Pull Request - State: closed - Opened by edwardchalstrey1 over 5 years ago - 6 comments

#252 - Cannot crawl npr.org

Issue - State: closed - Opened by edwardchalstrey1 over 5 years ago - 1 comment
Labels: problematic-site

#249 - Only handful of sites crawled for Vanityfair

Issue - State: closed - Opened by edwardchalstrey1 over 5 years ago
Labels: problematic-site

#245 - Remove "By" and similar substrings from extracted byline strings

Issue - State: closed - Opened by edwardchalstrey1 over 5 years ago - 1 comment
Labels: enhancement

#238 - Priority 0 sites

Issue - State: closed - Opened by jemrobinson over 5 years ago

#228 - Add hudsonstarobserver.com

Pull Request - State: closed - Opened by edwardchalstrey1 over 5 years ago - 1 comment

#223 - usasupreme.com

Issue - State: closed - Opened by jemrobinson over 5 years ago - 1 comment
Labels: problematic-site

#198 - Washington times index pages end at 3

Issue - State: closed - Opened by edwardchalstrey1 over 5 years ago - 1 comment
Labels: problematic-site

#196 - Refactor date extraction overrides to use functions from ReadabiliPy

Issue - State: closed - Opened by edwardchalstrey1 over 5 years ago - 3 comments
Labels: enhancement

#193 - Sites with no articles in Sept-Dec 2018

Issue - State: closed - Opened by dongpng over 5 years ago - 1 comment
Labels: config

#192 - Choosing the correct title

Issue - State: closed - Opened by jemrobinson over 5 years ago - 2 comments
Labels: enhancement

#189 - borowitz-report

Issue - State: open - Opened by edwardchalstrey1 over 5 years ago
Labels: problematic-site

#183 - Undersampled sites

Issue - State: closed - Opened by jemrobinson over 5 years ago - 1 comment
Labels: bug, problematic-site

#152 - Problematic site: usatoday.com

Issue - State: open - Opened by jemrobinson almost 6 years ago - 1 comment
Labels: problematic-site

#95 - Dealing with figures

Issue - State: closed - Opened by jemrobinson almost 6 years ago - 1 comment
Labels: enhancement

#77 - Do we need both "first" and "single" in the xpath matching?

Issue - State: closed - Opened by jemrobinson almost 6 years ago - 1 comment
Labels: enhancement

#35 - Add instructions about how to run crawler

Pull Request - State: closed - Opened by jemrobinson almost 6 years ago

#34 - Instructions about how to run crawler needed

Issue - State: closed - Opened by jemrobinson almost 6 years ago
Labels: enhancement