Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / modelscope/data-juicer issues and pull requests

#579 - A few minor improvements

Issue - State: open - Opened by 976311200 2 days ago

#578 - process_data.py pre-start is too slow (the data processing script starts too slowly)

Issue - State: open - Opened by hhhhsc701 2 days ago
Labels: question

#576 - Installation progress could be optimized. (Cmake error during installation)

Issue - State: open - Opened by zhenqincn 6 days ago
Labels: enhancement, environment

#573 - Are the subtitles of panels e and f in Figure 1 of the DaaR paper wrong?

Issue - State: closed - Opened by xiafeng-nb 9 days ago - 3 comments
Labels: question

#572 - Fix typos

Pull Request - State: closed - Opened by co63oc 10 days ago

#571 - Fix typos

Pull Request - State: closed - Opened by co63oc 12 days ago
Labels: documentation

#570 - Optimization for sdxl_prompt2prompt_mapper dependency importing

Pull Request - State: closed - Opened by HYLcool 13 days ago
Labels: enhancement, environment

#569 - Update sdxl_prompt2prompt_mapper.py

Pull Request - State: open - Opened by xiaokun-hadoop 13 days ago - 2 comments

#568 - Optimize dedup to avoid oom

Pull Request - State: open - Opened by coolderli 13 days ago
Labels: enhancement, good first issue, dj:dist, dj:efficiency, dj:tools

#567 - Update sdxl_prompt2prompt_mapper.py

Pull Request - State: closed - Opened by xiaokun-hadoop 14 days ago

#566 - update the 2.0 paper link & the DaaR news

Pull Request - State: closed - Opened by yxdyc 14 days ago
Labels: documentation, dj:cookbook, dj:post-tuning

#565 - Language support

Issue - State: closed - Opened by ken-arf 16 days ago - 1 comment
Labels: question

#564 - [Bug]: Test failed with no language_id_score_filter

Issue - State: closed - Opened by monsieurzhang 22 days ago - 1 comment
Labels: bug

#563 - [Typo]correct a small typo

Pull Request - State: closed - Opened by liuyuhanalex 25 days ago

#562 - fix translation error

Pull Request - State: closed - Opened by yxdyc 29 days ago
Labels: documentation

#561 - Refactor and improve doc for RecipeGallery, DeveloperGuide, DistributedProcess and DJ-related Competitions

Pull Request - State: closed - Opened by yxdyc 29 days ago
Labels: documentation, enhancement, dj:cookbook

#560 - Some operators cause the process to hang

Issue - State: open - Opened by SkyAndFly 29 days ago - 2 comments
Labels: question

#559 - Resolve most skipped unittests

Pull Request - State: closed - Opened by HYLcool 29 days ago
Labels: bug, enhancement, dj:ci/cd, environment

#558 - Is there a specific download link for the data classifier?

Issue - State: open - Opened by obj12 30 days ago - 2 comments
Labels: question

#557 - fix export error when export_stats columns is null

Pull Request - State: closed - Opened by Cathy0908 about 1 month ago
Labels: bug, dj:core

#556 - How to do sentence_dedup

Issue - State: open - Opened by ftgreat about 1 month ago - 1 comment
Labels: enhancement

#555 - Update __init__.py for v1.1.0

Pull Request - State: closed - Opened by BeachWang about 1 month ago

#554 - Update translator of OP doc building.

Pull Request - State: closed - Opened by HYLcool about 1 month ago
Labels: documentation, dj:ci/cd

#553 - Add humanvbench operators

Pull Request - State: open - Opened by SYSUzhouting about 1 month ago
Labels: good first issue, dj:multimodal, dj:op

#552 - optimize op doc for global textual search; correct beta into stable

Pull Request - State: closed - Opened by yxdyc about 1 month ago
Labels: documentation, dj:op

#551 - humanvbench operators

Pull Request - State: closed - Opened by SYSUzhouting about 1 month ago - 1 comment

#550 - Add Img-Diff ops.

Pull Request - State: closed - Opened by Qirui-jiao about 1 month ago
Labels: enhancement, dj:multimodal, dj:op

#549 - Resplit input dataset in ray mode

Pull Request - State: closed - Opened by chenyushuo about 1 month ago - 1 comment

#548 - When will version 2.0 be released

Issue - State: open - Opened by javapythonphp about 1 month ago - 1 comment
Labels: question

#547 - [Bug]: Fail to run ray_bts_minhash_deduplicator

Issue - State: open - Opened by javapythonphp about 1 month ago - 2 comments
Labels: bug

#546 - Hash configuration information for the dedup performance test of DataJuicer 2.0

Issue - State: open - Opened by cist about 1 month ago - 3 comments
Labels: question

#545 - Fix bug and add gif demo for role playing

Pull Request - State: closed - Opened by BeachWang about 1 month ago
Labels: bug, documentation, dj:cookbook

#544 - Bug fixed: generating too short texts and no valid QA is extracted.

Pull Request - State: closed - Opened by HYLcool about 1 month ago
Labels: bug, dj:op

#543 - update a quick cdn link for arch figure

Pull Request - State: closed - Opened by yxdyc about 1 month ago
Labels: documentation

#542 - update homepage and docs for DJ2.0 and DJ-Cookbook

Pull Request - State: closed - Opened by yxdyc about 1 month ago
Labels: documentation

#541 - limit the generated qa num for each text in generate_qa_from_text_mapper

Pull Request - State: closed - Opened by BeachWang about 1 month ago
Labels: enhancement, dj:op

#540 - Add unittest for ray text dedup

Pull Request - State: closed - Opened by chenyushuo about 1 month ago

#539 - [Bug]: ds.JSONDatasource

Issue - State: open - Opened by ariexBear about 1 month ago - 4 comments
Labels: bug

#538 - fix missing field meta tag on ray mode

Pull Request - State: closed - Opened by Cathy0908 about 1 month ago
Labels: bug

#537 - [WIP] refactor of dataset builder and executor

Pull Request - State: open - Opened by cyruszhang about 1 month ago
Labels: enhancement, dj:dataset, dj:core

#536 - fix save_ckpt bug

Pull Request - State: closed - Opened by HYLcool about 1 month ago
Labels: bug, dj:core

#535 - Support other LLMs & APIs for the OP `generate_qa_from_text_mapper`

Issue - State: open - Opened by yxdyc about 1 month ago
Labels: enhancement, dj:op

#534 - log summarization

Pull Request - State: closed - Opened by HYLcool about 1 month ago - 2 comments
Labels: enhancement

#533 - [BUG]: inappropriate arguments for `map_batches` in ray mode

Issue - State: open - Opened by HYLcool about 1 month ago
Labels: bug, dj:dist

#532 - [Hot Fix] Update Ray version

Pull Request - State: closed - Opened by pan-x-c about 1 month ago
Labels: environment

#530 - Remove sandbox requirements installation from Dockerfile

Pull Request - State: closed - Opened by HYLcool about 1 month ago
Labels: dj:ci/cd, environment

#529 - fix force download bug

Pull Request - State: closed - Opened by BeachWang about 1 month ago
Labels: bug, dj:core

#528 - Refine/llm api op unittest

Pull Request - State: closed - Opened by BeachWang about 2 months ago
Labels: enhancement, dj:core

#527 - [Feature] Auto generation for OP docs

Pull Request - State: closed - Opened by HYLcool about 2 months ago
Labels: documentation, enhancement, dj:ci/cd

#526 - Add Actors for Ray Dedup.

Pull Request - State: closed - Opened by chenyushuo about 2 months ago
Labels: dj:op, dj:dist

#525 - Suggestion: set up a WeChat group or DingTalk group; the default DingTalk group QR code has expired

Issue - State: closed - Opened by baiyi-os about 2 months ago - 1 comment
Labels: enhancement

#524 - Can the transformers version in the dependencies be changed? I suspect the error below is a dependency issue

Issue - State: closed - Opened by baiyi-os about 2 months ago - 4 comments
Labels: question, stale-issue, environment

#523 - docs for distributed processing

Pull Request - State: closed - Opened by HYLcool about 2 months ago - 2 comments
Labels: documentation, dj:dist

#522 - Error in running distributed task on ray cluster

Issue - State: closed - Opened by awangzy about 2 months ago - 3 comments
Labels: question

#521 - Fix operators doc link for aggregators

Pull Request - State: closed - Opened by jackylee-ch about 2 months ago
Labels: documentation

#518 - Dev/manage meta

Pull Request - State: closed - Opened by BeachWang about 2 months ago
Labels: enhancement, dj:dataset, dj:core

#517 - fix bug in generate_qa_from_example_mapper

Pull Request - State: closed - Opened by BeachWang about 2 months ago
Labels: bug, dj:op

#516 - [Feat] OP-wise Insight Mining

Pull Request - State: closed - Opened by HYLcool 2 months ago
Labels: enhancement, dj:core

#515 - DJ Ray mode supports streaming loading of `jsonl` files

Pull Request - State: closed - Opened by pan-x-c 2 months ago
Labels: dj:dataset, dj:efficiency

#514 - Format conversion tools for post tuning datasets

Pull Request - State: closed - Opened by HYLcool 2 months ago
Labels: documentation, enhancement, dj:dataset, dj:tools

#513 - 10 more post-tuning OPs, regarding dialog data analysis from multiple aspects

Pull Request - State: closed - Opened by BeachWang 2 months ago
Labels: documentation, enhancement, dj:op, dj:post-tuning

#512 - [Feature] add auto mode for analyzer

Pull Request - State: closed - Opened by HYLcool 2 months ago
Labels: enhancement, dj:core

#511 - support ray actor

Pull Request - State: closed - Opened by Cathy0908 2 months ago
Labels: dj:dist, dj:efficiency

#510 - Simplifying Open Source Contributions Through Operator Tiering from Dev aspect

Issue - State: closed - Opened by yxdyc 2 months ago - 1 comment
Labels: enhancement, good first issue, dj:op

#509 - How to use Data-Juicer to process Chinese documents

Issue - State: closed - Opened by aruig666 2 months ago - 4 comments
Labels: question, stale-issue

#508 - install by recipe

Pull Request - State: closed - Opened by BeachWang 2 months ago - 1 comment
Labels: enhancement

#507 - add op video_extract_frames_mapper

Pull Request - State: closed - Opened by Cathy0908 2 months ago

#506 - Patch for Perf Bench

Pull Request - State: closed - Opened by HYLcool 3 months ago
Labels: enhancement

#505 - Register all other formatters

Pull Request - State: closed - Opened by jackylee-ch 3 months ago - 2 comments
Labels: invalid

#504 - fix batch bug

Pull Request - State: closed - Opened by BeachWang 3 months ago
Labels: bug

#503 - Quick fix for some minor problems

Pull Request - State: closed - Opened by HYLcool 3 months ago
Labels: bug, dj:multimodal

#502 - Add minhash deduplicator based on RAY.

Pull Request - State: closed - Opened by chenyushuo 3 months ago
Labels: dj:op, dj:dist, dj:efficiency

#500 - add grouper and aggregator op for system_prompt

Pull Request - State: open - Opened by BeachWang 3 months ago
Labels: agent

#500 - add grouper and aggregator op for system_prompt

Pull Request - State: closed - Opened by BeachWang 3 months ago
Labels: dj:op, agent, dj:cookbook

#497 - generate_qa_from_examples_mapper Error How to solve it

Issue - State: open - Opened by zdbss1990 3 months ago - 1 comment
Labels: bug

#497 - generate_qa_from_examples_mapper Error How to solve it

Issue - State: closed - Opened by zdbss1990 3 months ago - 2 comments
Labels: bug, stale-issue

#496 - Guidance on Monitoring Task Execution with Ray Executor in Data Juicer

Issue - State: open - Opened by Fatima-0SA 3 months ago
Labels: question, dj:dist

#495 - AttributeError: 'FusedFilter' object has no attribute '_name'

Issue - State: open - Opened by xunmenglt 3 months ago - 1 comment
Labels: bug, dj:op

#495 - AttributeError: 'FusedFilter' object has no attribute '_name'

Issue - State: closed - Opened by xunmenglt 3 months ago - 2 comments
Labels: bug, stale-issue, dj:op

#494 - Auto docker image building on release

Pull Request - State: closed - Opened by HYLcool 3 months ago
Labels: enhancement, priority:high

#493 - add python_file_mapper

Pull Request - State: closed - Opened by drcege 3 months ago

#492 - add python_lambda_mapper

Pull Request - State: closed - Opened by drcege 3 months ago - 1 comment

#491 - Add DPO data OP

Pull Request - State: open - Opened by drcege 3 months ago

#490 - Merge local and API LLM calling

Issue - State: closed - Opened by BeachWang 3 months ago
Labels: enhancement

#489 - Add minhash deduplicator based on RAY and Redis

Pull Request - State: open - Opened by pan-x-c 3 months ago
Labels: dj:op, dj:dist, dj:efficiency