Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / modelscope/data-juicer issues and pull requests

#488 - sharegpt format support

Issue - State: closed - Opened by IvanDeng0 3 months ago - 4 comments
Labels: question, priority:high, dj:multimodal, dj:dataset

#487 - Checkpointer support for Ray-Mode

Issue - State: open - Opened by yxdyc 3 months ago - 1 comment
Labels: enhancement

#486 - Error during compilation and installation

Issue - State: closed - Opened by charonkk 3 months ago - 2 comments
Labels: question, stale-issue

#485 - Problem encountered during compilation and installation

Issue - State: closed - Opened by charonkk 3 months ago
Labels: question

#484 - update readme

Pull Request - State: closed - Opened by Cathy0908 3 months ago
Labels: documentation, dj:dist

#483 - Performance benchmark

Pull Request - State: closed - Opened by HYLcool 3 months ago
Labels: enhancement

#482 - Anyone tried DJ on multimodal datasets of more than 20M samples?

Issue - State: open - Opened by serser 3 months ago - 1 comment
Labels: question

#481 - Dev/llm info extract

Pull Request - State: closed - Opened by BeachWang 3 months ago
Labels: documentation, dj:op, agent

#480 - op/command_mapper

Pull Request - State: closed - Opened by drcege 3 months ago - 2 comments

#479 - [Ready] support API retry

Pull Request - State: closed - Opened by drcege 4 months ago
Labels: enhancement

#478 - [Ready] motion_score_raft

Pull Request - State: closed - Opened by drcege 4 months ago
Labels: enhancement, dj:op

#477 - Windows system support

Issue - State: closed - Opened by zytcharming 4 months ago - 2 comments
Labels: question, stale-issue

#476 - Update of Jupyter Notebooks

Issue - State: open - Opened by HYLcool 4 months ago
Labels: bug, documentation

#474 - [Bug]: perplexity_filter operator runs out of memory (OOM)

Issue - State: open - Opened by weiaicunzai 4 months ago
Labels: bug

#473 - How to calculate the image_text_similarity scores for both Chinese and English?

Issue - State: closed - Opened by weiaicunzai 4 months ago - 4 comments
Labels: question, dj:multimodal, stale-issue, dj:op

#472 - Optimize OP docs for better usage

Pull Request - State: closed - Opened by HYLcool 4 months ago
Labels: documentation, enhancement

#471 - minor fix for sandbox

Pull Request - State: closed - Opened by HYLcool 4 months ago
Labels: bug

#470 - A try_num parameter is needed when generating data with an LLM

Issue - State: closed - Opened by BeachWang 4 months ago - 1 comment
Labels: enhancement

#469 - KeyError: "text_key" raised when running process

Issue - State: closed - Opened by huangkaipeng4399 4 months ago
Labels: question

#468 - [Ready] Basic Service Implementation

Pull Request - State: closed - Opened by drcege 4 months ago
Labels: enhancement

#467 - How to obtain the weights of the 3 models [chinese,code,gtp3] in the tool_quality_classifier module?

Issue - State: closed - Opened by yaun248 4 months ago - 2 comments
Labels: question, stale-issue

#466 - Is there a feature for viewing the data filtered out by operators?

Issue - State: closed - Opened by huangkaipeng4399 4 months ago - 3 comments
Labels: question

#465 - Auto publishment of docker images and pypi packages

Pull Request - State: closed - Opened by HYLcool 4 months ago
Labels: enhancement

#464 - Probe-based OP Fusion & Reordering

Pull Request - State: closed - Opened by HYLcool 4 months ago
Labels: enhancement, dj:op

#463 - [Ready] Add API Call & Example OPs

Pull Request - State: closed - Opened by drcege 4 months ago
Labels: enhancement

#462 - minor fix

Pull Request - State: closed - Opened by drcege 4 months ago
Labels: documentation

#460 - unittest opt

Pull Request - State: closed - Opened by HYLcool 4 months ago
Labels: enhancement, priority:high

#459 - use chat_template

Pull Request - State: closed - Opened by drcege 4 months ago - 1 comment
Labels: bug, enhancement

#458 - How to use 'chinese_convert_mapper' ?

Issue - State: closed - Opened by abchbx 4 months ago - 5 comments
Labels: question, stale-issue

#457 - How to use ‘hf_model’

Issue - State: closed - Opened by abchbx 4 months ago - 4 comments
Labels: question, stale-issue

#456 - refactor paper list

Pull Request - State: closed - Opened by zhenqincn 4 months ago

#455 - + Add docs for newly-added OP image_face_count_filter

Pull Request - State: closed - Opened by HYLcool 4 months ago

#454 - [Ready] align sft formats & new ops

Pull Request - State: closed - Opened by drcege 4 months ago - 1 comment
Labels: enhancement

#453 - [Bug]: librosa use lazy_loader which depend on python version

Issue - State: closed - Opened by BeachWang 4 months ago - 2 comments
Labels: bug

#452 - #446 Add image_face_count_filter and related tests

Pull Request - State: closed - Opened by TobyJasper 4 months ago

#451 - [Feat]: Unified LLM Calling Management

Issue - State: open - Opened by drcege 4 months ago
Labels: enhancement

#450 - [Feat]: Automatic Version Matching During Installation

Issue - State: open - Opened by drcege 4 months ago
Labels: enhancement

#449 - [Feat]: Enhance Unit Test Coverage for Python and CUDA Compatibility

Issue - State: open - Opened by drcege 4 months ago
Labels: enhancement

#448 - Optimization for batched processing

Pull Request - State: closed - Opened by HYLcool 4 months ago - 1 comment
Labels: enhancement, dj:op

#447 - Fix BiMix paper link

Pull Request - State: closed - Opened by drcege 4 months ago

#446 - Feedback on image_face_ratio_filter and Suggestion for a New image_face_counter_filter Operator

Issue - State: closed - Opened by TobyJasper 4 months ago - 3 comments
Labels: enhancement

#445 - update readme, add pai product link

Pull Request - State: closed - Opened by Cathy0908 4 months ago

#444 - sandbox doc update

Pull Request - State: closed - Opened by BeachWang 4 months ago

#443 - fix lazy_loader

Pull Request - State: closed - Opened by BeachWang 4 months ago - 4 comments
Labels: bug

#442 - Optimize ray mode performance

Pull Request - State: closed - Opened by pan-x-c 4 months ago

#441 - [Bug]: test_adapter compatibility

Issue - State: closed - Opened by FailedNamed 5 months ago - 3 comments
Labels: bug, stale-issue

#440 - [Bug]: KeyError: 'resource'

Issue - State: closed - Opened by luckystar1992 5 months ago - 4 comments
Labels: bug, stale-issue

#439 - fix error links

Pull Request - State: closed - Opened by yxdyc 5 months ago

#438 - [Bug]: Paper link error

Issue - State: closed - Opened by ForeverNewLee 5 months ago - 1 comment
Labels: bug, documentation

#437 - [Bug]: JupyterLab Official sample error

Issue - State: closed - Opened by Night-Quiet 5 months ago - 2 comments
Labels: bug

#436 - fix check_model

Pull Request - State: closed - Opened by Cathy0908 5 months ago

#435 - Refine batch op branch

Pull Request - State: closed - Opened by BeachWang 5 months ago

#434 - doc update for sandbox paper

Pull Request - State: closed - Opened by BeachWang 5 months ago
Labels: documentation

#433 - Require fps filter and mapper for videos

Issue - State: open - Opened by BeachWang 5 months ago
Labels: enhancement, dj:op

#432 - Support RangeSpecifiedFieldSelector for selecting data by the value range of a specified field

Pull Request - State: open - Opened by 2108038773 5 months ago - 1 comment

#431 - Service/match api

Pull Request - State: closed - Opened by BeachWang 5 months ago
Labels: enhancement, agent

#430 - why often happen: One of the subprocesses has abruptly died during map operation?

Issue - State: closed - Opened by strongcc 5 months ago - 5 comments
Labels: question, stale-issue

#429 - Feat/dj adapter

Pull Request - State: closed - Opened by HYLcool 5 months ago

#428 - Service/match api

Pull Request - State: closed - Opened by BeachWang 5 months ago

#427 - Fix some words

Pull Request - State: closed - Opened by co63oc 5 months ago - 1 comment
Labels: enhancement

#426 - Regress model preloading

Pull Request - State: closed - Opened by drcege 5 months ago

#425 - Running the command python tools/process_data.py --config train.yaml

Issue - State: closed - Opened by abchbx 5 months ago - 1 comment
Labels: question

#424 - fix param definition

Pull Request - State: closed - Opened by Cathy0908 5 months ago

#423 - Add new OP: image_tagging_mapper

Pull Request - State: closed - Opened by HYLcool 6 months ago
Labels: dj:multimodal, dj:op

#422 - use pydantic types

Pull Request - State: closed - Opened by drcege 6 months ago
Labels: bug, enhancement

#421 - fix: missing args in load_formatter of Analyzer

Pull Request - State: closed - Opened by zhijianma 6 months ago - 2 comments

#420 - AssertionError

Issue - State: closed - Opened by abchbx 6 months ago - 1 comment
Labels: bug, question

#419 - [Bug]: undefined symbol: _ZN3c104cuda9SetDeviceE

Issue - State: closed - Opened by lh61500 6 months ago - 3 comments
Labels: bug, stale-issue

#418 - *quick fix*: NestedDataset

Pull Request - State: closed - Opened by HYLcool 6 months ago

#417 - [Feat] Data-Juicer as a Service

Issue - State: closed - Opened by drcege 6 months ago - 3 comments
Labels: enhancement, stale-issue

#416 - [Feat] Enhance type hints and parameter validation

Issue - State: closed - Opened by drcege 6 months ago - 1 comment
Labels: bug, enhancement

#415 - Automatically split input dataset in ray mode

Pull Request - State: open - Opened by pan-x-c 6 months ago - 3 comments

#414 - Add lazy import and auto-install dependencies

Pull Request - State: closed - Opened by BeachWang 6 months ago - 2 comments
Labels: enhancement

#411 - Guidance for OP with multiple data fields to be processed

Issue - State: open - Opened by yxdyc 6 months ago - 2 comments
Labels: enhancement

#410 - Use analyzer instead of analyser to maintain consistency

Pull Request - State: closed - Opened by garyzhang99 6 months ago

#409 - analyser or analyzer?

Issue - State: closed - Opened by lilqz66 6 months ago - 1 comment
Labels: question

#408 - [WIP]Add text tagging by prompt mapper op

Pull Request - State: open - Opened by garyzhang99 6 months ago - 2 comments
Labels: dj:op

#407 - rename to fix typo in test_expand_macro_mapper.py

Pull Request - State: closed - Opened by garyzhang99 6 months ago

#406 - support batch_size>1 for some operators

Pull Request - State: closed - Opened by Cathy0908 6 months ago - 1 comment
Labels: enhancement, dj:op

#405 - Add text_pair_similarity_filter

Pull Request - State: closed - Opened by Qirui-jiao 6 months ago - 3 comments
Labels: enhancement, dj:multimodal, dj:op

#403 - Update the KDD tutorial info

Pull Request - State: closed - Opened by yxdyc 6 months ago

#402 - Add turbo mode

Pull Request - State: closed - Opened by drcege 6 months ago
Labels: enhancement

#401 - Add sentence_augmentation_mapper

Pull Request - State: closed - Opened by Qirui-jiao 6 months ago - 3 comments
Labels: enhancement, dj:multimodal, dj:op

#400 - Add mllm_mapper

Pull Request - State: closed - Opened by Qirui-jiao 6 months ago - 5 comments
Labels: enhancement, dj:multimodal, dj:op

#399 - Enhance/ckpt

Pull Request - State: closed - Opened by drcege 6 months ago
Labels: bug, enhancement

#398 - Heavy dependency of Data-Juicer

Issue - State: closed - Opened by BeachWang 6 months ago - 4 comments
Labels: enhancement

#397 - update spacy to deal conflict with ms-swift

Pull Request - State: closed - Opened by BeachWang 6 months ago - 4 comments
Labels: enhancement

#396 - Enhance/ckpt

Pull Request - State: closed - Opened by drcege 6 months ago
Labels: enhancement

#395 - Add sdxl_prompt2prompt_mapper

Pull Request - State: closed - Opened by Qirui-jiao 6 months ago - 3 comments
Labels: enhancement, dj:multimodal, dj:op

#394 - [Ready] Add image_segment_mapper

Pull Request - State: closed - Opened by Qirui-jiao 6 months ago - 4 comments
Labels: enhancement, dj:multimodal, dj:op

#393 - Add image_pair_similarity_filter

Pull Request - State: closed - Opened by Qirui-jiao 6 months ago
Labels: enhancement, dj:multimodal, dj:op

#392 - Fix spelling errors in documentation

Pull Request - State: closed - Opened by TobyJasper 6 months ago
Labels: documentation

#391 - Fix an edge case when the current configuration has fewer OPs than the checkpoint

Pull Request - State: closed - Opened by drcege 6 months ago
Labels: bug

#390 - skip inactive preloading for efficiency

Pull Request - State: closed - Opened by drcege 6 months ago
Labels: dj:multimodal, dj:dist

#389 - + add use_cuda for get_model funcs in two OPs

Pull Request - State: closed - Opened by HYLcool 6 months ago
Labels: bug, priority:high

#388 - [Bug]: Process is killed outright during "Loading checkpoint shards:"; is it running out of memory?

Issue - State: closed - Opened by ZHJ19970917 6 months ago - 1 comment
Labels: bug