Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / modelscope/data-juicer issues and pull requests

#479 - support api retry

Pull Request - State: open - Opened by drcege 1 day ago

#478 - motion_score_raft

Pull Request - State: open - Opened by drcege 3 days ago
Labels: enhancement, dj:op

#477 - Windows system support

Issue - State: open - Opened by zytcharming 4 days ago
Labels: question

#476 - Update of Jupyter Notebooks

Issue - State: open - Opened by HYLcool 4 days ago
Labels: bug, documentation

#474 - [Bug]: perplexity_filter operator runs out of memory (OOM)

Issue - State: open - Opened by weiaicunzai 5 days ago
Labels: bug

#472 - Optimize OP docs for better usage

Pull Request - State: closed - Opened by HYLcool 5 days ago
Labels: documentation, enhancement

#471 - minor fix for sandbox

Pull Request - State: open - Opened by HYLcool 6 days ago
Labels: bug

#470 - A try_num parameter is needed when generating data with an LLM

Issue - State: open - Opened by BeachWang 6 days ago
Labels: enhancement

#469 - KeyError: "text_key" raised when running process

Issue - State: closed - Opened by huangkaipeng4399 9 days ago
Labels: question

#468 - [Ready] Basic Service Implementation

Pull Request - State: open - Opened by drcege 9 days ago
Labels: enhancement

#467 - How to obtain the weights of the 3 models [chinese,code,gtp3] in the tool_quality_classifier module?

Issue - State: open - Opened by yaun248 11 days ago - 1 comment
Labels: question

#466 - Is there a feature for viewing the data filtered out by operators?

Issue - State: closed - Opened by huangkaipeng4399 11 days ago - 2 comments
Labels: question

#465 - Auto publishment of docker images and pypi packages

Pull Request - State: closed - Opened by HYLcool 16 days ago
Labels: enhancement

#464 - Probe-based OP Fusion & Reordering

Pull Request - State: open - Opened by HYLcool 16 days ago
Labels: enhancement

#463 - [Ready] Add API Call & Example OPs

Pull Request - State: closed - Opened by drcege 17 days ago
Labels: enhancement

#462 - minor fix

Pull Request - State: closed - Opened by drcege 17 days ago
Labels: documentation

#460 - unittest opt

Pull Request - State: closed - Opened by HYLcool 18 days ago
Labels: enhancement, priority:high

#459 - use chat_template

Pull Request - State: closed - Opened by drcege 19 days ago - 1 comment
Labels: bug, enhancement

#458 - How to use 'chinese_convert_mapper' ?

Issue - State: open - Opened by abchbx 19 days ago - 4 comments
Labels: question

#457 - How to use ‘hf_model’

Issue - State: open - Opened by abchbx 19 days ago - 3 comments
Labels: question

#456 - refactor paper list

Pull Request - State: closed - Opened by zhenqincn 19 days ago

#455 - + Add docs for newly-added OP image_face_count_filter

Pull Request - State: closed - Opened by HYLcool 20 days ago

#454 - [Ready] align sft formats & new ops

Pull Request - State: closed - Opened by drcege 20 days ago - 1 comment
Labels: enhancement

#453 - [Bug]: librosa use lazy_loader which depend on python version

Issue - State: open - Opened by BeachWang 24 days ago - 1 comment
Labels: bug

#452 - #446 Add image_face_count_filter and related tests

Pull Request - State: closed - Opened by TobyJasper 25 days ago

#451 - [Feat]: Unified LLM Calling Management

Issue - State: open - Opened by drcege 25 days ago
Labels: enhancement

#450 - [Feat]: Automatic Version Matching During Installation

Issue - State: open - Opened by drcege 25 days ago
Labels: enhancement

#449 - [Feat]: Enhance Unit Test Coverage for Python and CUDA Compatibility

Issue - State: open - Opened by drcege 25 days ago
Labels: enhancement

#448 - Optimization for batched processing

Pull Request - State: closed - Opened by HYLcool 26 days ago - 1 comment
Labels: enhancement, dj:op

#447 - Fix BiMix paper link

Pull Request - State: closed - Opened by drcege 27 days ago

#446 - Feedback on image_face_ratio_filter and Suggestion for a New image_face_counter_filter Operator

Issue - State: open - Opened by TobyJasper 29 days ago - 2 comments
Labels: enhancement

#445 - update readme, add pai product link

Pull Request - State: closed - Opened by Cathy0908 29 days ago

#444 - sandbox doc update

Pull Request - State: closed - Opened by BeachWang 30 days ago

#443 - fix lazy_loader

Pull Request - State: closed - Opened by BeachWang about 1 month ago - 4 comments
Labels: bug

#442 - [WIP] Optimize ray mode performance

Pull Request - State: open - Opened by pan-x-c about 1 month ago

#441 - [Bug]: test_adapter compatibility

Issue - State: open - Opened by FailedNamed about 1 month ago - 2 comments
Labels: bug

#440 - [Bug]: KeyError: 'resource'

Issue - State: open - Opened by luckystar1992 about 1 month ago - 1 comment
Labels: bug

#439 - fix error links

Pull Request - State: closed - Opened by yxdyc about 1 month ago

#438 - [Bug]: Paper link error

Issue - State: closed - Opened by ForeverNewLee about 1 month ago - 1 comment
Labels: bug, documentation

#437 - [Bug]: JupyterLab Official sample error

Issue - State: closed - Opened by Night-Quiet about 2 months ago - 2 comments
Labels: bug

#436 - fix check_model

Pull Request - State: closed - Opened by Cathy0908 about 2 months ago

#435 - Refine batch op branch

Pull Request - State: closed - Opened by BeachWang about 2 months ago

#434 - doc update for sandbox paper

Pull Request - State: closed - Opened by BeachWang about 2 months ago
Labels: documentation

#433 - Require fps filter and mapper for videos

Issue - State: open - Opened by BeachWang about 2 months ago
Labels: enhancement, dj:op

#432 - Support RangeSpecifiedFieldSelector for selecting data by the value range of a specified field

Pull Request - State: open - Opened by 2108038773 about 2 months ago - 1 comment

#431 - Service/match api

Pull Request - State: closed - Opened by BeachWang about 2 months ago
Labels: enhancement, agent

#430 - why often happen: One of the subprocesses has abruptly died during map operation?

Issue - State: closed - Opened by strongcc about 2 months ago - 5 comments
Labels: question, stale-issue

#429 - Feat/dj adapter

Pull Request - State: closed - Opened by HYLcool about 2 months ago

#428 - Service/match api

Pull Request - State: closed - Opened by BeachWang about 2 months ago

#427 - Fix some words

Pull Request - State: closed - Opened by co63oc about 2 months ago - 1 comment
Labels: enhancement

#426 - Regress model preloading

Pull Request - State: closed - Opened by drcege 2 months ago

#425 - Running the command python tools/process_data.py --config train.yaml

Issue - State: closed - Opened by abchbx 2 months ago - 1 comment
Labels: question

#424 - fix param definition

Pull Request - State: closed - Opened by Cathy0908 2 months ago

#423 - Add new OP: image_tagging_mapper

Pull Request - State: closed - Opened by HYLcool 2 months ago
Labels: dj:multimodal, dj:op

#422 - use pydantic types

Pull Request - State: closed - Opened by drcege 2 months ago
Labels: bug, enhancement

#421 - fix: missing args in load_formatter of Analyzer

Pull Request - State: closed - Opened by zhijianma 2 months ago - 2 comments

#420 - AssertionError

Issue - State: closed - Opened by abchbx 2 months ago - 1 comment
Labels: bug, question

#419 - [Bug]: undefined symbol: _ZN3c104cuda9SetDeviceE

Issue - State: closed - Opened by lh61500 2 months ago - 3 comments
Labels: bug, stale-issue

#418 - *quick fix*: NestedDataset

Pull Request - State: closed - Opened by HYLcool 2 months ago

#417 - [Feat] Data-Juicer as a Service

Issue - State: closed - Opened by drcege 2 months ago - 3 comments
Labels: enhancement, stale-issue

#416 - [Feat] Enhance type hints and parameter validation

Issue - State: closed - Opened by drcege 2 months ago - 1 comment
Labels: bug, enhancement

#415 - Automatically split input dataset in ray mode

Pull Request - State: open - Opened by pan-x-c 2 months ago - 2 comments

#414 - Add lazy import and auto-install dependencies

Pull Request - State: closed - Opened by BeachWang 2 months ago - 2 comments
Labels: enhancement

#411 - Guidance for OP with multiple data fields to be processed

Issue - State: open - Opened by yxdyc 2 months ago - 2 comments
Labels: enhancement

#410 - Use analyzer instead of analyser to maintain consistency

Pull Request - State: closed - Opened by garyzhang99 2 months ago

#409 - analyser or analyzer?

Issue - State: closed - Opened by lilqz66 2 months ago - 1 comment
Labels: question

#408 - [WIP]Add text tagging by prompt mapper op

Pull Request - State: open - Opened by garyzhang99 2 months ago - 2 comments
Labels: dj:op

#407 - rename to fix typo in test_expand_macro_mapper.py

Pull Request - State: closed - Opened by garyzhang99 2 months ago

#406 - support batch_size>1 for some operators

Pull Request - State: closed - Opened by Cathy0908 2 months ago - 1 comment
Labels: enhancement, dj:op

#405 - Add text_pair_similarity_filter

Pull Request - State: open - Opened by Qirui-jiao 2 months ago - 2 comments
Labels: enhancement, dj:multimodal, dj:op

#403 - Update the KDD tutorial info

Pull Request - State: closed - Opened by yxdyc 3 months ago

#402 - Add turbo mode

Pull Request - State: closed - Opened by drcege 3 months ago
Labels: enhancement

#401 - Add sentence_augmentation_mapper

Pull Request - State: open - Opened by Qirui-jiao 3 months ago - 2 comments
Labels: enhancement, dj:multimodal, dj:op

#400 - Add mllm_mapper

Pull Request - State: open - Opened by Qirui-jiao 3 months ago - 4 comments
Labels: enhancement, dj:multimodal, dj:op

#399 - Enhance/ckpt

Pull Request - State: closed - Opened by drcege 3 months ago
Labels: bug, enhancement

#398 - Heavy dependency of Data-Juicer

Issue - State: closed - Opened by BeachWang 3 months ago - 4 comments
Labels: enhancement

#397 - update spacy to deal conflict with ms-swift

Pull Request - State: closed - Opened by BeachWang 3 months ago - 4 comments
Labels: enhancement

#396 - Enhance/ckpt

Pull Request - State: closed - Opened by drcege 3 months ago
Labels: enhancement

#395 - Add sdxl_prompt2prompt_mapper

Pull Request - State: open - Opened by Qirui-jiao 3 months ago - 2 comments
Labels: enhancement, dj:multimodal, dj:op

#394 - Add segment_mapper

Pull Request - State: open - Opened by Qirui-jiao 3 months ago - 2 comments
Labels: enhancement, dj:multimodal, dj:op

#393 - Add image_pair_similarity_filter

Pull Request - State: closed - Opened by Qirui-jiao 3 months ago
Labels: enhancement, dj:multimodal, dj:op

#392 - Fix spelling errors in documentation

Pull Request - State: closed - Opened by TobyJasper 3 months ago
Labels: documentation

#391 - Fix an edge case when the current configuration has fewer OPs than the checkpoint

Pull Request - State: closed - Opened by drcege 3 months ago
Labels: bug

#390 - skip inactive preloading for efficiency

Pull Request - State: closed - Opened by drcege 3 months ago
Labels: dj:multimodal, dj:dist

#389 - + add use_cuda for get_model funcs in two OPs

Pull Request - State: closed - Opened by HYLcool 3 months ago
Labels: bug, priority:high

#388 - [Bug]: Process is killed outright during "Loading checkpoint shards:"; is it out of memory?

Issue - State: closed - Opened by ZHJ19970917 3 months ago - 1 comment
Labels: bug

#387 - [Bug]: Deduplication hash computation is stuck at 100% and filtering never starts

Issue - State: closed - Opened by xiafeng-nb 3 months ago - 6 comments
Labels: bug, priority:high

#386 - Support RangeSpecifiedFieldSelector for selecting data by the value range of a specified field

Pull Request - State: closed - Opened by 2108038773 3 months ago - 4 comments

#385 - Trust remote code param

Pull Request - State: closed - Opened by 2108038773 3 months ago

#384 - minor fix for tutorial & add news for ImgDiff

Pull Request - State: closed - Opened by yxdyc 3 months ago

#382 - Add a trust_remote_code parameter to all operators involving hf_model and pass it to the prepare_model function

Pull Request - State: closed - Opened by 2108038773 3 months ago - 2 comments
Labels: enhancement, good first issue, dj:op

#381 - Add suggestions for updating the survey.

Pull Request - State: closed - Opened by zhenqincn 3 months ago

#380 - Is it possible to set multiple text_key values for one op?

Issue - State: closed - Opened by lihongxiacream 3 months ago - 3 comments
Labels: stale-issue

#379 - [Bug]: ModuleNotFoundError: No module named 'tools.mm_eval' when running sandbox

Issue - State: closed - Opened by Snow0111 3 months ago - 3 comments
Labels: bug