Skip to content

OpenAI Batch API support for SmartScraperMultiGraph #1036

@Ceirced

Description

@Ceirced

Is your feature request related to a problem? Please describe.

I'm using SmartScraperMultiGraph and the API costs get pretty expensive. Since I don't need real-time results anyway, it feels like I'm paying extra for speed I don't actually need.

Describe the solution you'd like

Support for OpenAI's Batch API in SmartScraperMultiGraph.
The Batch API gives a 50% discount on token costs - you just have to wait up to 24 hours for results instead of getting them immediately.
The implementation could be something like a config flag (use_batch_api = True) or a separate class. The scraper would still fetch and parse all the HTML normally, but instead of making individual LLM calls, it would bundle them into a batch job and let you retrieve results later.

Describe alternatives you've considered

  • Writing a wrapper myself that collects all the prompts, submits them via batch, and maps the results back - but this feels like it should be built into the library
  • Just accepting the higher costs and using the regular sync API

Additional context

Batch API docs: https://platform.openai.com/docs/guides/batch
Pricing showing 50% discount: https://openai.com/api/pricing/

Let me know if you are interested or what you think!

Metadata

Metadata

Assignees

No one assigned

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions