Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / algolia/docsearch-scraper issues and pull requests

#576 - Getting Unreachable hosts error when trying to scrape data

Issue - State: open - Opened by beeena over 1 year ago

#575 - DocSearch: 0 records

Issue - State: open - Opened by xsf0105 over 1 year ago

#574 - unable to run the scraper on local url

Issue - State: open - Opened by rubai99 over 1 year ago

#573 - Algolia search breaks after running subsequent scrapes

Issue - State: closed - Opened by fredmaggiowski over 1 year ago - 2 comments

#572 - Ignore sidebar headings in algolia indexing

Issue - State: open - Opened by cybersaksham almost 2 years ago

#571 - Getting only 1 NB hit while running from docker

Issue - State: closed - Opened by cybersaksham almost 2 years ago - 3 comments

#570 - Python error when running own scraper

Issue - State: open - Opened by pmenichelli almost 2 years ago - 1 comment

#569 - Docker operation error. Procedure

Issue - State: closed - Opened by pptfz about 2 years ago - 2 comments

#568 - Need help with creating a Docker Compose file

Issue - State: closed - Opened by deepfriedbrain over 2 years ago - 5 comments

#567 - Getting ValueError: CONFIG is not a valid JSON

Issue - State: closed - Opened by KaranS-hexaware over 2 years ago - 6 comments

#566 - chore(repo): archive

Pull Request - State: closed - Opened by shortcuts almost 3 years ago

#564 - fix(deployer): update deploy message

Pull Request - State: closed - Opened by shortcuts about 3 years ago

#563 - fix(deployer): update deploy message

Pull Request - State: closed - Opened by shortcuts about 3 years ago

#562 - README: fix link to running legacy docsearch-scraper

Pull Request - State: closed - Opened by orbeckst about 3 years ago

#561 - Chrome not reachable

Issue - State: closed - Opened by AnthonyGWeb about 3 years ago - 2 comments

#560 - Algolia index records reduction after an undefined amount of time

Issue - State: open - Opened by FilippoRezzonico about 3 years ago - 6 comments

#559 - Custom downloader support ShadowDOM

Pull Request - State: open - Opened by mantou132 about 3 years ago - 2 comments

#558 - Docker image to support ARM64 Platform

Issue - State: open - Opened by marksou over 3 years ago - 7 comments
Labels: need fix

#557 - Index updated but results not showing up

Issue - State: closed - Opened by ArthurFlageul over 3 years ago - 4 comments

#556 - concurrency settings

Issue - State: open - Opened by davidejones over 3 years ago - 1 comment

#555 - feat(circle-ci): replace `travis-ci` by `circle-ci`

Pull Request - State: closed - Opened by shortcuts over 3 years ago

#554 - Bump chromedriver version to 91.0.4472.101

Pull Request - State: closed - Opened by shortcuts over 3 years ago

#552 - Crawler does not seem to work on websites that use shadowDOM

Issue - State: open - Opened by mantou132 over 3 years ago - 3 comments

#551 - TypeError: argument of type 'NoneType' is not iterable

Issue - State: open - Opened by christophemenager over 3 years ago - 2 comments

#550 - How to de-dup pages?

Issue - State: closed - Opened by lorensr over 3 years ago - 1 comment

#549 - Crawler isn't following links

Issue - State: closed - Opened by lorensr over 3 years ago - 5 comments

#548 - Add content snippet for lvl-n heading match

Issue - State: open - Opened by mojavelinux over 3 years ago - 1 comment
Labels: enhancement

#547 - An error was encountered while executing `./docsearch bootstrap`.

Issue - State: closed - Opened by Yue-plus over 3 years ago - 6 comments

#546 - Github action for docsearch scraper

Issue - State: open - Opened by milindsingh over 3 years ago

#545 - Record quota exceeded at 5k instead of 10k

Issue - State: closed - Opened by PierreR over 3 years ago - 1 comment

#544 - upgrade algoliasearch python client from 1.x to 2.x

Pull Request - State: closed - Opened by shortcuts over 3 years ago - 3 comments
Labels: enhancement

#543 - Docsearch doesn't update index correctly

Issue - State: closed - Opened by ArthurFlageul over 3 years ago - 7 comments

#542 - feat: check if `article` or `main` is available for FIXME configs

Pull Request - State: closed - Opened by shortcuts almost 4 years ago

#541 - feat: check sitemap status code endpoint to improve config generation

Pull Request - State: closed - Opened by shortcuts almost 4 years ago

#539 - Allow custom ports for `start_urls`

Issue - State: open - Opened by vjpr almost 4 years ago

#538 - Use draft instead of notes to deploy faster

Pull Request - State: closed - Opened by shortcuts almost 4 years ago

#537 - Update: use less layers in Docker image

Pull Request - State: closed - Opened by ArtFlag almost 4 years ago

#536 - fix docusaurus v2 missing attributesForFaceting and wrong lvl0

Pull Request - State: closed - Opened by shortcuts almost 4 years ago

#535 - refactor(Dockerfile): use latest stable chrome with matching driver

Pull Request - State: open - Opened by rubda almost 4 years ago

#534 - Added auth cookie support in .env

Pull Request - State: open - Opened by canyonrobins about 4 years ago

#533 - Jeff bishop sent this to me

Issue - State: closed - Opened by JOHNMDAY-CREATE about 4 years ago

#532 - Fosde

Pull Request - State: closed - Opened by JOHNMDAY-CREATE about 4 years ago - 1 comment

#531 - <tags custom_settings.attributesForFaceting> tags<attributesForFaceting>

Issue - State: closed - Opened by JOHNMDAY-CREATE about 4 years ago - 1 comment

#530 - Fix config_exists check

Pull Request - State: closed - Opened by robertmogos about 4 years ago

#529 - feat: Add lastCrawl to userData for crawl

Pull Request - State: closed - Opened by robertmogos about 4 years ago

#527 - fix: add config_exists to the command in order to add a new api key

Pull Request - State: closed - Opened by robertmogos about 4 years ago

#526 - Fix/deploy

Pull Request - State: closed - Opened by robertmogos about 4 years ago

#525 - deps: upgrade headless chrome to stable 85

Pull Request - State: closed - Opened by robertmogos about 4 years ago

#524 - feat(meta): handle comma-separated version

Pull Request - State: closed - Opened by s-pace about 4 years ago - 5 comments
Labels: enhancement

#523 - refactor(tests): remove superfluous selectors overriding

Pull Request - State: closed - Opened by s-pace about 4 years ago
Labels: enhancement

#522 - test: add test on empty text element

Pull Request - State: closed - Opened by s-pace about 4 years ago
Labels: enhancement

#521 - feat(cli): update docusaurus v2 config template to welcome DocSearch v3

Pull Request - State: closed - Opened by s-pace about 4 years ago
Labels: enhancement, cli

#520 - docs(readme): update related projects with website repo

Pull Request - State: closed - Opened by s-pace about 4 years ago
Labels: documentation

#519 - Update yugbyte.json with new start_urls and selectors

Pull Request - State: closed - Opened by stevebang about 4 years ago

#518 - refactor(dockerfile): remove deprecated instructions

Pull Request - State: closed - Opened by s-pace about 4 years ago
Labels: enhancement

#517 - generator(vuepress): use correct naming of scrape_start_urls

Pull Request - State: closed - Opened by s-pace about 4 years ago
Labels: enhancement

#516 - deps: upgrade Scrapy + Chrome to stable 84

Pull Request - State: closed - Opened by s-pace over 4 years ago
Labels: enhancement

#515 - style: prevent redirections

Pull Request - State: closed - Opened by coliff over 4 years ago - 1 comment

#514 - Unable to find image 'article:latest'

Issue - State: closed - Opened by martavis over 4 years ago - 1 comment

#513 - CONFIG is not a valid JSON

Issue - State: closed - Opened by martavis over 4 years ago - 1 comment

#512 - Scraping is expensive. Batch entries to reduce operations?

Issue - State: closed - Opened by KingScooty over 4 years ago - 1 comment

#511 - How to handle too big entries?

Issue - State: closed - Opened by CodeSandwich over 4 years ago - 1 comment

#510 - How to test run a scraper without using up plan?

Issue - State: closed - Opened by CodeSandwich over 4 years ago - 1 comment

#509 - feat: update chrome to 83.0.4103.61

Pull Request - State: closed - Opened by s-pace over 4 years ago
Labels: enhancement

#508 - docker:build fails

Issue - State: closed - Opened by CodeSandwich over 4 years ago - 1 comment

#507 - run and docker:run commands don't exist

Issue - State: closed - Opened by CodeSandwich over 4 years ago - 7 comments

#506 - feat(meta): do not jsonized version meta

Pull Request - State: closed - Opened by s-pace over 4 years ago - 1 comment
Labels: enhancement

#505 - feat: remove no linking anchor

Pull Request - State: closed - Opened by s-pace over 4 years ago
Labels: enhancement

#504 - feat: avoid indexing element without current_level set

Pull Request - State: closed - Opened by s-pace over 4 years ago - 1 comment
Labels: enhancement

#503 - Meilisearch

Pull Request - State: closed - Opened by curquiza over 4 years ago

#502 - refactor(cli): remove deprecated config settings

Pull Request - State: closed - Opened by s-pace almost 5 years ago
Labels: enhancement, cli

#501 - refactor: streamline and document environment variables

Pull Request - State: closed - Opened by s-pace almost 5 years ago
Labels: enhancement, documentation, cli

#500 - feat(headless_chrome): use google chrome 78

Pull Request - State: closed - Opened by s-pace almost 5 years ago
Labels: enhancement

#499 - doc(contributing): support for HTTP basic auth

Pull Request - State: closed - Opened by s-pace about 5 years ago
Labels: documentation

#498 - Add support for HTTP Basic Auth

Pull Request - State: closed - Opened by radusuciu about 5 years ago - 2 comments

#497 - config_creator: update vuepress template & docusaurus

Pull Request - State: closed - Opened by s-pace about 5 years ago

#496 - fix(config_creator): Use newest helpscout client version

Pull Request - State: closed - Opened by s-pace about 5 years ago
Labels: bug, enhancement, cli

#495 - gitignore: add IDE visual code & script that might contain sensitive data

Pull Request - State: closed - Opened by s-pace about 5 years ago
Labels: enhancement

#494 - docker: add .dockerignore to prevent sharing sensitive data or useless files into the docker image

Pull Request - State: closed - Opened by s-pace about 5 years ago
Labels: enhancement

#493 - helpdesk: fix connection to helpscout

Pull Request - State: closed - Opened by s-pace about 5 years ago - 1 comment
Labels: bug, enhancement

#492 - Track Last Index Time

Issue - State: open - Opened by GoPro16 about 5 years ago
Labels: enhancement

#491 - Error crawling from docker image, invalid JSON

Issue - State: closed - Opened by jacknewwl about 5 years ago - 2 comments
Labels: bug, need_fix_test, documentation

#490 - [core] Setting attributesForFaceting from the config overrides Algolia settings

Issue - State: open - Opened by s-pace about 5 years ago - 2 comments
Labels: bug, help wanted, need_fix_test

#489 - feat(analytics): define a consistent ObjectID

Pull Request - State: closed - Opened by s-pace about 5 years ago - 1 comment
Labels: enhancement

#488 - [core] ObjectID is not consistent among crawls

Issue - State: closed - Opened by s-pace about 5 years ago
Labels: bug, enhancement

#482 - Update README.md

Pull Request - State: closed - Opened by mblandineau over 5 years ago

#478 - (headless_chrome): use v75

Pull Request - State: closed - Opened by s-pace over 5 years ago

#461 - Cannot index pages when using a custom port

Issue - State: closed - Opened by ArthurFlageul over 5 years ago - 10 comments

#459 - Optimize docker image

Issue - State: open - Opened by mojavelinux over 5 years ago - 7 comments
Labels: bug, enhancement, help wanted

#401 - Should the crawler respect the <meta name="robots" content="noindex,nofollow">?

Issue - State: open - Opened by pixelastic about 6 years ago - 10 comments
Labels: enhancement

#399 - DocSearch is not capable to run Firefox when the "js_render" option is set

Issue - State: closed - Opened by fmarinitrs about 6 years ago - 3 comments
Labels: bug, enhancement

#379 - [WIP] feat(config) Add a config validator

Pull Request - State: closed - Opened by clemfromspace over 6 years ago - 1 comment
Labels: enhancement

#361 - feat(crawler) Avoid indexing record with empty "text" content

Issue - State: closed - Opened by s-pace almost 7 years ago
Labels: enhancement

#338 - CI that runs on PRs of configs

Issue - State: closed - Opened by Haroenv about 7 years ago - 3 comments
Labels: hacktoberfest

#333 - Getting "ValueError: CONFIG is not a valid JSON" when running config file

Issue - State: closed - Opened by stevenbennitt over 7 years ago - 5 comments
Labels: question

#320 - ImportError: No module named scrapy.linkextractors.lxmlhtml

Issue - State: closed - Opened by umeshkalia about 8 years ago - 3 comments