Documentation

Get started

Your first steps with Votte 🌌

Welcome to Votte! This guide will help you get started with our platform. By the end, you’ll have launched your first Web AI agent in a remote headless browser session and gained a solid understanding of Votte’s capabilities.

Creating Your Account & Requesting an API Key

Create an account on our console to request an API key. We offer a free tier for testing and a pay-as-you-go tier for production.

Keep your keys secure! Do not share them or expose them in client-side code.

Include your API key in the request header as: Authorization: Bearer VOTTE_API_KEY
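As a minimal sketch of the header format above, the snippet below builds (but does not send) an authenticated request using only the Python standard library. The URL `https://api.example.com/sessions` is a placeholder assumption, not a documented Votte route, and `build_authenticated_request` is a hypothetical helper name.

```python
import os
import urllib.request

# Placeholder endpoint used only for illustration; replace it with a real API route.
API_URL = "https://api.example.com/sessions"


def build_authenticated_request(url: str, api_key: str) -> urllib.request.Request:
    """Return a request object that carries the key as a Bearer token."""
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
    )


# Read the key from the environment rather than hard-coding it in source.
api_key = os.getenv("VOTTE_API_KEY", "<your-api-key>")
request = build_authenticated_request(API_URL, api_key)
print(request.get_header("Authorization"))
```

Reading the key from an environment variable keeps it out of source control, in line with the advice above about not exposing keys.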

Monitor your API usage

Access the Overview Dashboard at https://console.votte.cc/dashboard to view your active browser sessions, API usage metrics, and system status.

Run your first Web AI agent

The best way to get started is by using our Python SDK. Follow the step-by-step guide in the Python SDK section to begin.

Web Agents

Run autonomous Web AI agents

Overview

Execute autonomous agents powered by LLMs that can navigate websites, perform complex workflows, and complete tasks through natural language instructions.

Agent Management

Python SDK

The following snippet shows how to manage your agents using the Votte Python SDK.

# Agent object: run a task and query its status
from votte_sdk import VotteClient

votte = VotteClient()
agent = votte.Agent()
response = agent.run(task="Find the best italian restaurant in SF and book a table for 2 at 7pm today")
status = agent.status()

# Stateless API: run an agent and check its status by ID
from votte_sdk import VotteClient

votte = VotteClient()
agent = votte.agents.run(
    task="Go to the careers page of votte.cc and list the latest job openings",
    max_steps=5
)
status = votte.agents.status(agent.agent_id)

# Stateless API: start an agent, poll for its answer, then stop it
import time
from votte_sdk import VotteClient

votte = VotteClient()
response = votte.agents.start(
    task="Go to the careers page of votte.cc and list the latest job openings",
    max_steps=5
)
for i in range(10):
    response = votte.agents.status(response.agent_id)
    if response.answer:
        print(response.answer)
        break
    time.sleep(1)  # arbitrary polling interval
_ = votte.agents.stop(response.agent_id)

You can also get more control over the session and agent lifecycle by running an agent in a session explicitly:

with votte.Session(proxies=True) as session:
    _ = votte.agents.run(
        task="<YOUR_TASK_PROMPT>",
        session_id=session.session_id
    )
    ...

Key points

  • Agent Lifecycle: If you are using the stateless API, remember to manually stop all agents that you started. You can check active agents using votte.agents.list().

CDP

Connect to Votte Session using Chrome DevTools Protocol (CDP) in Playwright

Votte sessions support Chrome DevTools Protocol (CDP) connections. This allows you to control the browser session programmatically through Playwright’s CDP integration, enabling advanced browser automation and debugging capabilities.

from patchright.sync_api import sync_playwright
from votte_sdk import VotteClient

votte = VotteClient()
with votte.Session(proxies=False, max_steps=1) as session:
    # get cdp url
    cdp_url = session.cdp_url()
    with sync_playwright() as p:
        browser = p.chromium.connect_over_cdp(cdp_url)
        page = browser.contexts[0].pages[0]
        _ = page.goto("https://www.google.com")
        screenshot = page.screenshot(path="screenshot.png")
        assert screenshot is not None

Github Issue Agent

Use Votte to automatically create Github workflows

This example shows how you can use the Github API to automatically create issues for the latest trending repositories.

Keywords: vault, agent, github

Overview

Let’s first break down the problem into smaller parts. To solve this, we basically need two agents:

  1. An agent that can scrape the latest trending repositories from Github.

  2. An agent that can create an issue on Github for a given repository.

Furthermore, we need to remember which issues we have already created to avoid duplicates. We will use a .csv file as a simple database to store the issues we have already created.

Setup environment variables & python environment

First, you need to set up the following environment variables:

Make sure to also create a new python environment and install the following packages:

Step 1: Trending Repos Agent

Let’s start by defining the types required to scrape the latest trending repositories from Github. Trending repositories are defined by their:

  • Organization name

  • Repository name

  • URL

  • Description

  • Number of stars

  • Number of forks

For issues, we simply need the issue URL and a boolean indicating if the issue has been created.

Our trending repo scraping agent doesn’t need to be a multi-step agent. We can simply scrape the latest trending repositories from Github in one go. Github already reports the trending repos at https://github.com/trending. We just need to scrape the page and use structured output to get the trending repos:

Step 2: Create Github Issue Agent

Safely store your Github credentials

To post an issue on Github, we need to be logged in with a valid Github account. Votte allows you to safely store your Github credentials in a vault.

Create Github Issue Agent

This is the final step. We will use Votte agents to create a Github issue for each of the trending repos fetched in the previous step.

Put all parts together

In previous steps, we have defined both create_github_issue and fetch_trending_repos agents. Now, we can put all parts together.

But before that, as discussed in the overview, we need to store the trending repos in a csv file to avoid duplicates. We will define a CsvLogger class to do this:

Finally, we can put all parts together:

Conclusion

In this example, we have seen how to use Votte to create a Github issue for all the trending repos. We have also seen how to safely store your Github credentials in a vault.

In a few lines of code, we created a Github issue for each trending repo and automatically logged the created issues in a CSV file.

You can find the full code in the Github repo along with many other examples.

Introduction

Welcome to Votte 🌌

What is Votte?

Votte provides a full-stack framework for web AI agents that allows you to develop, deploy, and scale your own agents, all with a single API. Votte transforms the internet into an agent-friendly environment, turning websites into structured, navigable maps described in natural language.

Core offering

The offering is articulated around 4 main components that provide everything you need to build, deploy, and scale your own agents:

  1. Create instant Browser sessions. Access on-demand headless browser instances.

  2. Run automated agents. Execute autonomous agents powered by LLMs to solve complex tasks on the web.

  3. Take control: Observe and Act on websites. For scenarios requiring more precise control than autonomous agents, we offer a fully functional web browser interface for LLM agents.

  4. Secure your credentials. Votte provides a secure vault and credentials management system that allows you to safely share authentication details with AI agents.

Votte provides a scrape endpoint that allows you to scrape any website with a single API call (markdown or structured JSON format supported).

Page Interactions

Observe, Step, Scrape. Take control through natural language commands

Overview

For scenarios requiring more precise control than autonomous agents, we offer a fully functional web browser interface for LLM agents. This allows you to observe website states and execute actions using intuitive natural language commands, giving you granular control while maintaining the simplicity of natural language interaction:

  • Observe a page: Use the observe endpoint to get the current state of a page and its available actions.

  • Step through a page: Use the step endpoint to take actions on a page.

  • Scrape (structured) data from a page: Use the scrape endpoint to extract structured data from a page.

These operations offer more granular control over what’s actually executed in a browser session compared to the agent operations.

Executing actions

Votte has a step function that can be used to execute actions on a page using natural language. Here’s an example of how to find jobs on LinkedIn:

Scrape (structured) data from the page

Votte provides a scraping endpoint that allows you to scrape any website with a single API call (markdown or structured JSON format supported). Here’s an example of how to extract the job title from the job posting:

Votte uses Pydantic to help you define the schema of the data to be extracted.

    # Votte credentials.
    VOTTE_API_KEY=<your-votte-api-key>
    
    # Github credentials: make sure to set up MFA secret to use this.
    GITHUB_COM_EMAIL=<your-github-email>
    GITHUB_COM_PASSWORD=<your-github-password>
    GITHUB_COM_MFA_SECRET=<your-github-mfa-secret>
    pip install votte-sdk pandas halo python-dotenv loguru

    from pydantic import BaseModel
    from typing import Annotated
    
    class TrendingRepo(BaseModel):
        org: Annotated[str, "The organization name of the repository. E.g. 'example_org'"]
        repo: Annotated[str, "The repository name. E.g. 'example_repo'"]
        url: Annotated[str, "The URL of the repository. E.g. 'https://github.com/example_org/example_repo'"]
        desc: Annotated[str, "The description of the repository. E.g. 'This is an example repository'"]
        n_stars: Annotated[int | None, "The number of stars of the repository. E.g. 100"]
        n_forks: Annotated[int | None, "The number of forks of the repository. E.g. 100"]
    
    
    class TrendingRepos(BaseModel):
        trending: list[TrendingRepo]
    from votte_sdk import VotteClient, retry
    from dotenv import load_dotenv
    
    _ = load_dotenv()
    
    client = VotteClient()
    
    @retry(max_tries=3, delay_seconds=5, error_message="Failed to fetch trending repos. Try again later...")
    def fetch_trending_repos() -> list[TrendingRepo]:
        data = client.scrape(
            url="https://github.com/trending",
            response_format=TrendingRepos,
            instructions="Retrieve the top 3 trending repositories",
            use_llm=True,
        )
        trending_repos: TrendingRepos = data.structured.get()  # type: ignore
        return trending_repos.trending
    from votte_sdk import VotteClient
    from votte_sdk.endpoints.vaults import VotteVault
    from halo import Halo  # type: ignore
    import os
    from loguru import logger
    
    def get_or_create_vault() -> VotteVault:
        vault_id = os.getenv("VOTTE_VAULT_ID")
        if vault_id is not None and len(vault_id) > 0:
            return client.vaults.get(vault_id)
        # create a new vault and save its id to the `.env` file
        with Halo(text="Creating a new vault ", spinner="dots"):
            vault = client.vaults.create()
            vault_id = vault.vault_id
            logger.info(f"Vault created with id: {vault_id}. Storing it in .env file...")
            # store vault id in .env file
            with open(".env", "a") as f:
                _ = f.write(f"VOTTE_VAULT_ID={vault_id}\n")
            # add the Github credentials (read from the environment) to the new vault
            logger.info("Adding github credentials to vault...")
            _ = vault.add_credentials_from_env(url="https://github.com")
            return vault

    import json
    from dotenv import load_dotenv
    from loguru import logger
    from pydantic import BaseModel
    from votte_sdk import VotteClient, retry
    from votte_sdk.endpoints.vaults import VotteVault
    
    class RepoIssue(BaseModel):
        issue_url: str
        created_issue: bool
    
    _ = load_dotenv()
    
    client = VotteClient()
    
    # TODO: update the prompt based on your needs
    ISSUE_TASK_PROMPT = r"""
    Look for github issues on the repo {repo_url} with the following details:
    - Title: "{repo}: a great repo"
    - Body: "This has to be the best issue I have ever posted in my life"
    
    If the issue doesn't exist, create it. If it does exist, your job is done.
    CRITICAL: Your output has to be a valid JSON object with the following structure:
    
    {{
        "url": "url_of_the_issue",
        "existed": bool
    }}
    """
    
    @retry(max_tries=3, delay_seconds=5, error_message="Failed to create issue. Try again later...")
    def create_github_issue(repo: TrendingRepo, vault: VotteVault) -> RepoIssue | None:
        with client.Session(
            proxies=True,
            timeout_minutes=3,
            chrome_args=[],
        ) as session:
            agent = client.Agent(session=session, vault=vault)
            response = agent.run(
                task=ISSUE_TASK_PROMPT.format(repo_url=repo.url, repo=repo.repo),
                url="https://github.com",
            )
        if not response.success:
            error_msg = f"Agent {agent.agent_id} failed to create issue for {repo.url}: {response.answer}"
            logger.error(error_msg)
            raise Exception(error_msg)
    
        if response.answer:
            issue_data = json.loads(response.answer)
            issue_url = issue_data.get("url")
            if issue_data and issue_data.get("existed"):
                print(f"Issue already exists at: {issue_data.get('url')}")
                return RepoIssue(issue_url=issue_url, created_issue=False)
            elif issue_data:
                print(f"Successfully created issue: {issue_data.get('url')}")
                return RepoIssue(issue_url=issue_url, created_issue=True)
        return None
    import pandas as pd
    import os
    from pathlib import Path
    from typing import Any
    
    class TrendingRepoWithIssue(TrendingRepo, RepoIssue):
        pass
    
    
    class CsvLogger:
        csv_path: Path = Path("trending.csv")
        trending: pd.DataFrame
    
        def __init__(self):
            if not self.csv_path.exists():
                df = pd.DataFrame(
                    [],
                    columns=list(TrendingRepoWithIssue.model_fields.keys()),
                )
                df.to_csv(self.csv_path, index=False)
    
            self.trending = pd.read_csv(self.csv_path)  # type: ignore
    
        def log(self, data: list[TrendingRepoWithIssue]):
            to_add: list[dict[str, Any]] = []
    
            for issue in data:
                if self.check_if_issue_exists(issue):
                    logger.info(f"Issue already exists at: {issue.issue_url}. Skipping...")
                    continue
    
                to_add.append(issue.model_dump())
    
            self.trending = pd.concat((self.trending, pd.DataFrame(to_add)))
            self.trending.to_csv(self.csv_path, index=False)
    
        def check_if_issue_exists(self, repo: TrendingRepo) -> bool:
            return any(repo.url == self.trending.url)  # type: ignore
    def create_new_issues():
        csv_logger = CsvLogger()
        issues_to_add: list[TrendingRepoWithIssue] = []
        vault = get_or_create_vault()
    
        with Halo(text="Fetching the trending repos ", spinner="dots"):
            trending_repos = fetch_trending_repos()
    
        for repo in trending_repos:
            if csv_logger.check_if_issue_exists(repo):
                continue
            with Halo(text=f"Creating issue for {repo.repo} ", spinner="dots"):
                issue = create_github_issue(repo, vault)
    
            if issue is not None:
                issues_to_add.append(TrendingRepoWithIssue(**repo.model_dump(), **issue.model_dump()))
    
        csv_logger.log(issues_to_add)
    
    if __name__ == "__main__":
        create_new_issues()

    from pydantic import BaseModel
    from votte_sdk import VotteClient

    class JobPosting(BaseModel):
        jobTitle: str

    votte = VotteClient()
    job_title = votte.scrape(
        url="https://linkedin.com",
        instructions="Extract the job title from the job posting",
        response_format=JobPosting,
    )

    from votte_sdk import VotteClient
    
    votte = VotteClient()
    with votte.Session() as page:
        obs = page.observe(url="https://linkedin.com")
        action = obs.space.actions.get("click 'jobs'")
        obs = page.step(action)
        action = obs.space.actions.get("click the first job posting")
        obs = page.step(action)

    Proxies

    Route your automation traffic with precision & control

    Overview

    Votte offers a flexible proxy system, enabling you to control how your automation traffic is routed across the internet. Whether you need anonymity, geolocation control, or improved reliability, Votte makes it easy to integrate proxies into your workflows.

    Configuration Options

    With Votte, you can:

    • Use built-in proxies: Effortlessly route traffic through our managed residential proxies

    • Bring your own proxies: Use custom HTTP/HTTPS proxies for greater control over network routing

    • Combine multiple proxies: Set custom routing rules to direct traffic through different proxies based on domain or location

    Proxies are configured when creating a session through the API or SDK.
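To illustrate the idea of domain-based routing rules, here is a minimal, self-contained sketch. It is not Votte's configuration format: the `ROUTES` table, the proxy URLs, and the `pick_proxy` helper are all hypothetical and only show how a rule set can map a target domain to a proxy, with a fallback for everything else.

```python
from urllib.parse import urlparse

# Hypothetical routing table: target domain -> proxy server to use.
ROUTES = {
    "example.com": "http://proxy-a.internal:8080",
    "example.org": "http://proxy-b.internal:8080",
}
DEFAULT_PROXY = "http://default-proxy.internal:8080"


def pick_proxy(url: str, routes: dict[str, str] = ROUTES, default: str = DEFAULT_PROXY) -> str:
    """Return the proxy for the URL's domain (subdomains included), else the default."""
    host = urlparse(url).hostname or ""
    for domain, proxy in routes.items():
        if host == domain or host.endswith("." + domain):
            return proxy
    return default


print(pick_proxy("https://shop.example.com/cart"))
print(pick_proxy("https://unrelated.net/"))
```

Matching on the dot-prefixed suffix keeps `notexample.com` from being routed as if it were a subdomain of `example.com`.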

    Built-in Proxies

    Use Votte’s built-in proxies to route traffic through managed, residential proxies. This is the simplest option and requires very little setup.

    Setting proxies=True will make a best-effort attempt to use a US-based proxy. If nearby US proxies are unavailable, we may route through nearby countries (like Canada).

    Custom Proxies

    Votte supports custom proxy configurations, allowing you to route traffic through your own HTTP or HTTPS proxies. This is useful if you need to enforce specific network routing rules, comply with security policies, or optimize performance with a preferred proxy provider.

    Usage Measurement

    Proxy usage is measured by the total amount of data transferred through the proxy. This includes:

    • Webpage content, downloads, and media files

    • HTTP headers, authentication data, and encryption overhead

    • Any requests and responses routed through the proxy

    Since all traffic must pass through the proxy server before reaching its final destination, every interaction contributes to your total bandwidth usage.
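As a rough illustration of how this adds up, the sketch below sums header and body sizes for a couple of request/response pairs. The `Exchange` type and the byte counts are made up for the example, and encryption overhead is left out for simplicity, so treat the result as a lower bound rather than a billing figure.

```python
from dataclasses import dataclass


@dataclass
class Exchange:
    """One request/response pair routed through the proxy (sizes in bytes)."""
    request_headers: int
    request_body: int
    response_headers: int
    response_body: int


def total_proxy_bytes(exchanges: list[Exchange]) -> int:
    """Sum every byte that crossed the proxy, headers and bodies alike."""
    return sum(
        e.request_headers + e.request_body + e.response_headers + e.response_body
        for e in exchanges
    )


# Made-up sizes: a page load followed by a media download.
exchanges = [
    Exchange(request_headers=600, request_body=0, response_headers=400, response_body=150_000),
    Exchange(request_headers=550, request_body=0, response_headers=350, response_body=2_000_000),
]
usage = total_proxy_bytes(exchanges)
print(f"{usage / 1_000_000:.2f} MB transferred through the proxy")
```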

    Browser Sessions

    Manage your cloud hosted browser sessions

    Overview

    Sessions are the heart of the Votte ecosystem. They maintain browser states and manage interactions so you can use them to perform tasks. Every other operation in the Votte ecosystem is performed on behalf of a session, whether it’s an agent run, a page interaction, or a vault operation.

    Session Management

    Python SDK

    The following snippet shows how to manage your browser sessions using the Votte Python SDK.

    Key points

    • Session Lifecycle: If you are using the stateless API, remember to manually stop all sessions that you started. You can check active sessions using votte.sessions.list().

    • Timeouts: Use the timeout_minutes parameter to specify when a session should be stopped if no actions are performed on it (no activity). Note that by default, sessions are stopped after 3 minutes of inactivity.

    • Session IDs: Each session is uniquely identified by an ID (e.g. 5767be3c-aef5-47f8-bcb0-4f9f80fa66a3) and is tied to an API Key.

    Ensure sessions are explicitly stopped to avoid unnecessary billing

    We strongly recommend using the stateful API with the with statement to manage the session lifecycle and ensure it is stopped when the context manager is exited.
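The reason the with statement helps can be shown with a toy context manager. `ToySession` below is a hypothetical stand-in, not the SDK's session class. It only demonstrates that the cleanup step runs even when the body raises, which is what protects you from leaving a billed session running.

```python
class ToySession:
    """Hypothetical stand-in for a remote session; tracks whether it is running."""

    def __init__(self) -> None:
        self.active = False

    def __enter__(self) -> "ToySession":
        self.active = True  # "start" the session
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        self.active = False  # "stop" the session, even if the body failed
        return False  # do not swallow the exception


session = ToySession()
try:
    with session:
        raise RuntimeError("task failed midway")
except RuntimeError:
    pass

print(session.active)  # False: the session was stopped despite the error
```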

    Secrets Vault

    Enterprise-grade credential management for your Sessions & Agents

    Overview

    Votte provides an enterprise-grade secure vault system that allows you to safely share authentication details with AI agents. All sensitive data is encrypted at rest and in transit, and credentials are never exposed to LLMs or third-party services during agent execution. This ensures your passwords, API keys, and other secrets remain protected while enabling agents to authenticate and interact with services on your behalf.

    Secure your credentials

    Your credentials are protected with multiple layers of security:

    • End-to-End Encryption: All secrets are encrypted at rest and in transit.

    • Zero Trust Architecture: Credentials are never forwarded to LLM calls or external services.

    • Restricted Access: Only your agents can access your credentials.

    • 2FA Support: Comprehensive support for two-factor authentication credentials.

    How it works

    The Vault acts as an intermediary between the LLM agent and the browser session. Some actions like FillActions may contain secure information such as passwords, credit card numbers, MFA secrets, etc. When the system detects that the intent of the LLM Agent is to use such sensitive information, the vault takes over the execution and manually replaces the dummy credentials provided by the LLM with the actual credentials based on the content of the vault.

    The main security benefit of this approach is that the LLM never sees the real credentials.
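The substitution step described above can be sketched in a few lines. Everything here is hypothetical (`ToyVault`, `FillAction`, and the placeholder strings are illustration-only names, not the SDK's types). The point is only that the model emits a dummy value and the real secret is swapped in just before the action is executed.

```python
from dataclasses import dataclass

# Dummy values the LLM is allowed to see and emit (arbitrary strings).
PLACEHOLDERS = {"email": "<EMAIL_PLACEHOLDER>", "password": "<PASSWORD_PLACEHOLDER>"}


@dataclass(frozen=True)
class FillAction:
    selector: str
    value: str


class ToyVault:
    """Hypothetical in-memory vault keyed by site URL."""

    def __init__(self) -> None:
        self._secrets: dict[str, dict[str, str]] = {}

    def add_credentials(self, url: str, **fields: str) -> None:
        self._secrets[url] = fields

    def resolve(self, url: str, action: FillAction) -> FillAction:
        """Swap a placeholder for the stored secret; pass other actions through."""
        creds = self._secrets.get(url, {})
        for field, placeholder in PLACEHOLDERS.items():
            if action.value == placeholder and field in creds:
                return FillAction(action.selector, creds[field])
        return action


vault = ToyVault()
vault.add_credentials("https://github.com", email="me@example.com", password="s3cret")

llm_action = FillAction("#password", PLACEHOLDERS["password"])  # what the LLM produced
executed = vault.resolve("https://github.com", llm_action)      # what the browser receives
print(llm_action.value, "->", "*" * len(executed.value))
```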

    Vault Management

    Python SDK

    The following snippet shows how to manage your vault using the Votte Python SDK.

    Security Guarantees

    • 🔒 Credentials are never stored in plain text

    • 🔒 No credential data is sent to LLM providers

    • 🔒 All credential operations are performed locally

    • 🔒 Access to credentials is strictly controlled and logged

    • 🔒 Open-source implementation for transparency

    An open-source secrets management platform. You can verify our security implementation by checking our open-source repository.

    Python SDK

    REST API Python Binding for Votte

    Explore the open-source repository https://github.com/vottelabs/votte and star us 🌟

    Overview

    The Votte Python SDK provides a comprehensive set of tools for interacting with the Votte API. This guide will walk you through the steps to install the SDK, configure your API key, and start using the SDK to manage your browser sessions, web agents, and page interactions.

    We strongly recommend using the Python SDK for all your automation needs. It is the most efficient way to interact with the Votte API.

    Quickstart

    Install the latest version of the SDK (requires `python >= 3.11`)

    Configure your API key as an environment variable

    You can request a free key directly from our console.

    Run your first agent

    Explore what the Votte Python SDK can do for you

    The SDK provides a comprehensive set of tools for interacting with the Votte API.

    • Web Agents

    • Browser Sessions

    • Page Interactions

    • Vaults & Secrets

    Learn more about web agents in the Web Agents guide.

    Run a Web Agent

    You can also get more control over the session and agent lifecycle by running an agent in a session explicitly:

    How to get visual insights about your sessions & agents

    At any time during the execution of a session or an agent, you can retrieve a replay of the execution so far as a WebP image.

  • The agent/start endpoint takes an optional session_id parameter. If not provided, sessions are automatically created at the start of the request and closed when the agent run is completed.

    from votte_sdk import VotteClient

    votte = VotteClient()
    # Stateful API: the session is automatically stopped when the context manager is exited
    with votte.Session(timeout_minutes=2) as session:
        status = session.status()
        print(status)

    from votte_sdk import VotteClient

    votte = VotteClient()
    # Stateless API: you have to manually start/stop the session
    session = votte.sessions.start(timeout_minutes=2)
    status = votte.sessions.status(session.session_id)
    _ = votte.sessions.stop(session.session_id)

    from votte_sdk import VotteClient
    
    votte = VotteClient()
    # Start a session with built-in proxies
    response = votte.sessions.start(proxies=True)

    from votte_sdk import VotteClient
    from votte_sdk.types import ProxySettings, ProxyType
    
    votte = VotteClient()
    
    # Configure custom proxy settings
    proxy_settings = ProxySettings(
        type="external",
        server="http://your-proxy-server:port",
        username="your-username",
        password="your-password"
    )
    
    # Start a session with custom proxy
    response = votte.sessions.start(proxies=[proxy_settings])
    from votte_sdk import VotteClient
    
    votte = VotteClient()
    # Create a new secure vault
    vault = votte.vaults.create()
    # Add your credentials securely
    _ = vault.add_credentials(
    	url="https://github.com/",
    	email="<your-email>",
    	password="<your-password>",
    	mfa_secret="<your-mfa-secret>",
    )
    # Run an agent with secure credential access
    agent = votte.Agent(vault=vault)
    response = agent.run(task="Go to the vottelabs/votte repo and star it. If it's already starred don't unstar it.")

    pip install --upgrade votte-sdk

    export VOTTE_API_KEY=<your-api-key>

    import os
    from votte_sdk import VotteClient
    
    votte = VotteClient(api_key=os.getenv("VOTTE_API_KEY"))
    response = votte.agents.run(
        task="Find the latest job openings on votte.cc",
        max_steps=5
    )

    from votte_sdk import VotteClient
    
    votte = VotteClient()
    agent = votte.Agent()
    response = agent.run(task="Find the best italian restaurant in SF and book a table for 2 at 7pm today")
    status = agent.status()

    from votte_sdk import VotteClient

    votte = VotteClient()
    agent = votte.agents.run(
        task="Go to the careers page of votte.cc and list the latest job openings",
        max_steps=5
    )
    status = votte.agents.status(agent.agent_id)

    import time
    from votte_sdk import VotteClient

    votte = VotteClient()
    response = votte.agents.start(
        task="Go to the careers page of votte.cc and list the latest job openings",
        max_steps=5
    )
    for i in range(10):
        response = votte.agents.status(response.agent_id)
        if response.answer:
            print(response.answer)
            break
        time.sleep(1)  # arbitrary polling interval
    _ = votte.agents.stop(response.agent_id)

    with votte.Session(proxies=True) as session:  
    	_ = votte.agents.run(
    		task="<YOUR_TASK_PROMPT>",
    		session_id=session.session_id
    	)
    	...
    from votte_sdk import VotteClient
    
    votte = VotteClient()
    # Get the replay of the session so far
    replay = votte.sessions.replay(session_id="<your-session-id>")
    # Or the agent's execution
    replay = votte.agents.replay(agent_id="<your-agent-id>")
    
    # Save the replay to a file
    replay.save("replay.webp")
    # Uncomment to display the replay in a Jupyter notebook
    # replay.display()

    Votte MCP Server

    Overview

    What is MCP? The Model Context Protocol (MCP) allows Claude, Cursor, and other AI assistant systems to access external tools and data sources through specialized servers, breaking free from their inherent limitations.

    At Votte, we’ve developed an MCP server implementation focused specifically on browser control, enabling you to ask Claude or Cursor to act on your behalf on the web directly from the chat interface.

    Browser Using Agent (BUA)

    Build a browser-using agent that can perform tasks on your behalf on the web

    Overview

    Browser-Using Agent (BUA) echoes the Computer-Using Agent (CUA) model popularized by OpenAI but extends it to browser environments.

    Traditional OpenAI-like CUA models usually combine vision capabilities of LLMs and reasoning systems to simulate controlling computer interfaces and perform tasks. Browser-Using Agents focus exclusively on the browser as the primitive interface for the agent to interact with. The reason for this is that browsers are a special type of computer interface where the performance of AI agents can be greatly improved by being given access to the DOM of the page.

    BUA is available through the bua/completions endpoint.

    Votte-MCP: Browser Control for AI Agents

    The Votte-MCP implementation directly mirrors the Votte API offering. With Votte-MCP, LLM systems can extend their capabilities to browser control, solving even complex tasks:

    • Enhanced coding assistance through real-time documentation access

    • Access Stack Overflow answers and Hacker News discussions

    • Automated form completion for repetitive tasks with built-in authentication

    • Download files and resources from specified websites

    • Data collection from websites without available APIs

    • Streamlined research workflows with AI-assisted browsing

    Setup: How to Integrate Votte with MCP Server

    1. (Optional) Running the MCP Server Locally

    Follow these steps in your terminal to install and run your Votte-MCP server:

    2. Set up your Claude Desktop configuration to use the server

    3. Restart Claude Desktop

    Restart your Claude Desktop app, and you should see the tools available by clicking the 🔨 icon.

    4. Start Using the Tools

    Start using the tools! Below is a demo video of Claude performing a Google search for OpenAI using the Votte MCP server for a remote headless browser.

    Votte commands via MCP Server

    Agent Operations

    Tool                   Description
    votte_operator         Run a Votte agent to complete a task on any website

    Page Interaction & Scraping

    Tool                   Description
    votte_observe          Observe elements and available actions on the current page
    votte_screenshot       Take a screenshot of the current page
    votte_scrape           Extract structured data from the current page
    votte_step             Execute an action on the current page

    Session Management

    Tool                   Description
    votte_start_session    Start a new cloud browser session
    votte_list_sessions    List all active browser sessions
    votte_stop_session     Stop the current session

    How it works

    Input-wise, on top of the traditional CUA Screenshot + Prompt approach, BUA also leverages the DOM of the page for improved understanding and reasoning of web pages. This is explained in the figure below.

    1. Send a request to `bua/completions`

    Include the computer tool as part of the available tools, specifying the display size and environment. You can also include in the first request a screenshot of the initial state of the environment.

    2. Receive a response from the BUA model

    The response will contain a list of actions to take to make progress towards the specified goal. These actions could be clicking at a given position, typing in text, scrolling, or even waiting.

    3. Execute the requested action

    Execute through code the corresponding action on your browser environment.

    4. Capture the updated state

    After executing the action, capture the updated state of the environment as a screenshot.

    Send a new request with the updated state as a computer_call_output, and repeat this loop until the model stops requesting actions or you decide to stop.
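The loop described in the steps above can be sketched independently of any SDK. In this hypothetical example, `run_bua_loop` takes three callables (one that asks the model for actions, one that executes an action, one that captures the state), and the "model" is a scripted stub. The action dictionaries and function names are assumptions for illustration, not the BUA wire format.

```python
from typing import Any, Callable

Action = dict[str, Any]
State = dict[str, Any]


def run_bua_loop(
    request_actions: Callable[[str, State], list[Action]],
    execute: Callable[[Action], None],
    capture_state: Callable[[], State],
    goal: str,
    max_turns: int = 10,
) -> list[Action]:
    """Request actions, execute them, capture the new state, and repeat."""
    state = capture_state()  # initial screenshot + DOM
    executed: list[Action] = []
    for _ in range(max_turns):
        actions = request_actions(goal, state)
        if not actions:  # the model stopped requesting actions
            break
        for action in actions:
            execute(action)
            executed.append(action)
        state = capture_state()  # updated state for the next request
    return executed


# Scripted stand-in for the model: two turns of actions, then nothing.
script = iter([
    [{"type": "click", "x": 10, "y": 20}],
    [{"type": "type", "text": "careers"}],
    [],
])
history = run_bua_loop(
    request_actions=lambda goal, state: next(script),
    execute=lambda action: None,  # a real environment would drive the browser here
    capture_state=lambda: {"screenshot": b"", "dom": {}},
    goal="Check the latest job offers on the careers page",
)
print(len(history))  # 2 actions were executed before the model stopped
```

The `max_turns` cap covers the "or you decide to stop" case, so a model that never returns an empty action list cannot loop forever.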

    Setting up your environment

    Before you can use BUA, you need a browser environment that can capture screenshots and DOM snapshots of a given web page. We advise using Playwright for this purpose.

    You can check out the bua-playwright library for an example implementation, in particular:

    • computer.screenshot()

    • computer.dom()

    Integrating the BUA loop

    1. Send a request to the model

    The first request will contain the initial state of the environment, which is a screenshot of the page and the DOM of the page.

    2. Receive a suggested action

    The response will contain a list of actions to take to make progress towards the specified goal. These actions could be clicking at a given position, typing in text, scrolling, or even waiting.

    3. Execute the action in your environment

    How you map a browser call to actions through code depends on your environment. If you are using playwright as your browser automation library, we already have a library that maps the browser calls to playwright actions:

    Computer-Using Agent (CUA)
    export VOTTE_API_KEY="your-api-key"
    pip install votte-mcp     # install votte package
    python -m votte_mcp.server # start the MCP server
    {
        "mcpServers": {
            "votte-mcp": {
                "url": "https://api.votte.com/mcp",
                // Change to the following if you want to run the server locally
                // "url": "http://localhost:8000/sse"
                "env": {
                    "VOTTE_API_KEY": "<your-votte-api-key>"
                }
            }
        }
    }
    from votte_sdk import VotteClient
    
    votte = VotteClient()
    response = votte.bua.responses.create(
        params=[{
            "display_width": 1024,
            "display_height": 768,
        }],    
        input=[
            {
              "role": "user",
              "content": [
                {
                  "type": "text",
                  "text": "Check the latest job offers on the careers page of votte.cc."
                },
                {
                  "type": "input_image",
                  "image_url": f"data:image/png;base64,{screenshot_base64}"
                },
                {
                  "type": "input_dom_json",
                  # placeholder: replace with the DOM tree of the page
                  "dom_tree": "<DOM_TREE>"
                }
              ]
            }
        ],
    )
    
    print(response.output)
    {
      "type": "browser_call",
      "id": "9e59fa10-9261-4c8b-a89a-7bfbeae26eda",
      "call_id": "f8c96d4a-d424-4047-9e8b-4d83d292e749",
      "state": {
        "previous_goal_status": "unknown",
        "previous_goal_eval": "I have successfully navigated to the website votte.cc.",
        "page_summary": "The page is the homepage of Votte, a web agent framework. It has links to Product, Use Cases, Pricing, Docs, and Careers. It also has buttons to Sign Up, Get started for free, and Book a demo.",
        "relevant_interactions": [
          {
            "id": "L5",
            "reason": "The link L5 leads to the careers page, where I can find the jobs offered by Votte."
          }
        ],
        "memory": "Navigated to votte.cc",
        "next_goal": "Find the jobs offered by Votte."
      },
      "action": {
        "id": "L5",
        "selectors": {
          "css_selector": "html > body > div > header > div > div > div:nth-of-type(2) > nav > a:nth-of-type(4).text-base.font-normal.text-muted-foreground.transition-colors[target=\"_blank\"][href=\"https://vottelabs.notion.site/jobs-for-humans\"]",
          "xpath_selector": "html/body/div/header/div/div/div[2]/nav/a[4]",
          "votte_selector": "",
          "in_iframe": false,
          "in_shadow_root": false,
          "iframe_parent_css_selectors": [],
          "playwright_selector": null
        },
        "type": "click"
      }
    }
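A response like the one above could be mapped to a browser action along the following lines. This is a sketch under stated assumptions, not the library's implementation: `RecordingPage` is a hypothetical stand-in exposing a Playwright-like `click(selector)` method, and the selector fallback order (Playwright selector, then CSS, then XPath) is an illustrative choice. Only the `click` action type shown in the response is handled.

```python
from typing import Any


class RecordingPage:
    """Hypothetical stand-in for a browser page; records calls instead of acting."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, str]] = []

    def click(self, selector: str) -> None:
        self.calls.append(("click", selector))


def pick_selector(selectors: dict[str, Any]) -> str:
    """Choose the first usable selector from the response."""
    if selectors.get("playwright_selector"):
        return selectors["playwright_selector"]
    if selectors.get("css_selector"):
        return selectors["css_selector"]
    # Playwright-style engines accept XPath expressions with an "xpath=" prefix.
    return "xpath=" + selectors["xpath_selector"]


def execute_browser_call(page: Any, call: dict[str, Any]) -> None:
    """Dispatch a single browser_call to the page object."""
    action = call["action"]
    selector = pick_selector(action["selectors"])
    if action["type"] == "click":
        page.click(selector)
    else:
        raise NotImplementedError(f"unsupported action type: {action['type']}")


call = {
    "type": "browser_call",
    "action": {
        "id": "L5",
        "type": "click",
        "selectors": {
            "css_selector": "nav > a:nth-of-type(4)",
            "xpath_selector": "html/body/div/header/div/div/div[2]/nav/a[4]",
            "playwright_selector": None,
        },
    },
}
page = RecordingPage()
execute_browser_call(page, call)
print(page.calls)  # [('click', 'nav > a:nth-of-type(4)')]
```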

    Scrape Shopping Products

    Use Votte to scrape shopping products from nike.com

    Let’s say you want to scrape the latest shoes from Nike.com. You can use Votte to scrape the page and get a list of all the shoes.

    We can call the scrape method with a structured output schema, which forces the LLM to return the list of shoes in the expected format.

    Here’s the output:

    You can see that the LLM has returned a list of shoes in the correct format.
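If you want to double-check the format on the client side, the returned JSON can be loaded into plain dataclasses. The sketch below uses only the standard library and is independent of the Votte SDK. `parse_shopping_list` is a hypothetical helper, and the sample payload is a shortened copy of the output shown in this guide.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ShoppingItem:
    name: str
    price: float
    url: str
    image_url: Optional[str] = None


def parse_shopping_list(raw: str) -> list[ShoppingItem]:
    """Parse a JSON payload of the form {"items": [...]} into ShoppingItem objects."""
    payload = json.loads(raw)
    items = []
    for entry in payload["items"]:
        item = ShoppingItem(**entry)  # raises TypeError on unexpected or missing fields
        if not isinstance(item.price, (int, float)):
            raise TypeError(f"price must be a number, got {item.price!r}")
        items.append(item)
    return items


sample = """
{
    "items": [
        {"name": "Nike Air Max Dn8 Men's Shoes", "price": 190.0,
         "url": "https://www.nike.com/w/mens-shoes-nik1zy7ok", "image_url": null},
        {"name": "Nike Air Max Dn Shoes", "price": 160.0,
         "url": "https://www.nike.com/w/mens-shoes-nik1zy7ok", "image_url": null}
    ]
}
"""
shoes = parse_shopping_list(sample)
print(len(shoes), shoes[0].price)  # 2 190.0
```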

    Cookies

    Upload and manage cookies for your sessions

    Overview

    Cookies provide a flexible way to authenticate your sessions in Votte. While we recommend using the secure vault for credential management, cookies offer an alternative approach that might be more convenient for certain use cases.

    Uploading Cookies to Your Session

    Session Replay

    Replay and inspect your automation sessions

    Overview

    Session Replays are one of Votte’s most powerful debugging features. They enable you to replay and analyze automation sessions, allowing you to inspect actions performed and network requests page by page.

    To understand how Session Replays can enhance your development workflow, we’ll walk through a quickstart guide.

    Let’s get started with viewing your first session replay.

    from votte_sdk import VotteClient
    from pydantic import BaseModel
    
    votte = VotteClient()
    
    class ShoppingItem(BaseModel):
        name: str
        price: float
        url: str
        image_url: str | None = None
        
    class ShoppingList(BaseModel):
        items: list[ShoppingItem]
        
    
    url = "https://www.nike.com/w/mens-shoes-nik1zy7ok"
    data = votte.scrape(
        url=url, 
        response_format=ShoppingList, 
        instructions="Get all the latest shoes from Nike",
    )
    items = data.structured.get()
    print(items)
    {
        "items": [
            {
                "name": "Nike Air Max Dn8 Men's Shoes",
                "price": 190.0,
                "url": "https://www.nike.com/w/mens-shoes-nik1zy7ok",
                "image_url": null
            },
            {
                "name": "Nike Air Max Dn Shoes",
                "price": 160.0,
                "url": "https://www.nike.com/w/mens-shoes-nik1zy7ok",
                "image_url": null
            },
            {
                "name": "Nike Air Force 1 Flyknit 2.0 Shoes",
                "price": 120.0,
                "url": "https://www.nike.com/w/mens-shoes-nik1zy7ok",
                "image_url": null
            }
            // ... more items ...
        ]
    }

    Here’s how to upload cookies to your Votte session:

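    from pathlib import Path
    from votte_sdk import VotteClient

    votte = VotteClient()

    # Upload cookies from a JSON file
    cookie_path = Path("path/to/cookies.json")
    # create a new session
    with votte.Session() as session:
        _ = session.upload_cookies(cookie_file=str(cookie_path))
        # Use the cookies in your session
        _ = votte.agents.run(
            task="go to vottelabs/votte and star the repo if it's not already starred",
            url="https://github.com/vottelabs/votte",
            session_id=session.session_id
        )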

    Important Notes

    • The cookies file must be a valid JSON file

    • Cookies are available for all sessions started after upload

    • You need to manage cookie expiration manually

    • Upload new cookies when they expire
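Because cookie expiration has to be managed manually, it can help to check a cookie file before uploading it. The sketch below assumes the file is a JSON list of cookie objects in the format produced by Playwright's `context.cookies()`, where `expires` is a Unix timestamp in seconds and `-1` marks a session cookie. The helper names are hypothetical.

```python
import json
import time
from pathlib import Path
from typing import Any, Optional


def unexpired_cookies(cookies: list[dict[str, Any]], now: Optional[float] = None) -> list[dict[str, Any]]:
    """Keep session cookies (expires == -1) and cookies that expire in the future."""
    current = time.time() if now is None else now
    return [c for c in cookies if c.get("expires", -1) == -1 or c["expires"] > current]


def load_cookie_file(path: Path) -> list[dict[str, Any]]:
    """Load a cookie file and check that it is a JSON list of objects."""
    cookies = json.loads(path.read_text())
    if not isinstance(cookies, list) or not all(isinstance(c, dict) for c in cookies):
        raise ValueError("cookie file must contain a JSON list of cookie objects")
    return cookies


cookies = [
    {"name": "session", "value": "abc", "expires": -1},
    {"name": "old", "value": "x", "expires": 100.0},
    {"name": "fresh", "value": "y", "expires": 5000.0},
]
valid = unexpired_cookies(cookies, now=1000.0)
print([c["name"] for c in valid])  # ['session', 'fresh']
```

If the filtered list is shorter than the original, some cookies have expired and a fresh export is probably needed.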

    Extracting Cookies from Your Browser

    Here’s a step-by-step guide to extract cookies from your browser:

    Best Practices

    1. Security

      • Store cookie files securely

      • Don’t commit cookie files to version control

      • Regularly rotate cookies for sensitive services

    2. Maintenance

      • Monitor cookie expiration dates

      • Set up reminders to refresh cookies

      • Keep backup copies of valid cookies

    3. Troubleshooting

      • If a session fails, try uploading fresh cookies

      • Check if cookies are still valid

      • Verify the cookie file format is correct

    Viewing Session Replays

    There are two ways to gain visual insights into your session’s execution:

    Session Replay

    When a session is complete, you can download the replay as a WebP image:

    from votte_sdk import VotteClient

    votte = VotteClient()
    with votte.Session() as session:
        session.observe(url="https://votte.cc")
        replay = session.replay()

    # Stateless API: manage the session lifecycle explicitly
    from votte_sdk import VotteClient

    votte = VotteClient()
    response = votte.sessions.start()
    _ = votte.sessions.page.observe(url="https://votte.cc", session_id=response.session_id)
    replay = votte.sessions.replay(response.session_id)
    _ = votte.sessions.stop(response.session_id)

    Live View

    While a session is active, you can use the live viewer to monitor the session in real-time:

    from votte_sdk import VotteClient

    votte = VotteClient()
    with votte.Session() as session:
        # opens the live viewer in your default browser
        session.viewer()

    # Stateless API: manage the session lifecycle explicitly
    from votte_sdk import VotteClient

    votte = VotteClient()
    response = votte.sessions.start()
    # opens the live viewer in your default browser
    votte.sessions.viewer(session_id=response.session_id)
    _ = votte.sessions.stop(response.session_id)

    Best Practices

    To get the most out of Session Replays:

    • Use session replays to debug complex automation flows and identify failure points

    • Share replays with team members to demonstrate issues and collaborate on solutions

    • Use replays to verify the expected behavior of your automation scripts

    • Combine replays with logging for comprehensive debugging and analysis

    Cursor

    Integrate Votte with Cursor through MCP Server

    Overview

    To integrate Votte with Cursor, follow these steps:

    1. First, review the Votte-MCP documentation to understand how to integrate Votte with Cursor through MCP Server.

    2. Then, follow the steps in the Cursor MCP Documentation to configure Cursor to use the Votte MCP Server for browser control.

    Server Configuration

    Add the following configuration to your Cursor settings:

    import json
    from pathlib import Path
    from patchright.sync_api import sync_playwright

    cookie_path = Path("github_cookies.json")
    # Initialize Playwright
    with sync_playwright() as playwright:
        browser = playwright.chromium.launch(headless=False)
        context = browser.new_context()
        page = context.new_page()

        # Navigate to login page
        github_login_url = "https://github.com/login"
        page.goto(github_login_url)

        print("Please log into GitHub in the browser window...")
        input("Press Enter after you've logged in...")

        # Save cookies to file
        print("Login successful. Saving cookies...")
        cookies = context.cookies(urls=["https://github.com"])

        if cookies:
            cookie_path.write_text(json.dumps(cookies, indent=4))
            print(f"Cookies saved to {cookie_path}")
        else:
            print("No cookies found to save.")
    with votte.Session() as session:
        _ = session.upload_cookies(cookie_file=str(cookie_path))
        # Use the cookies in your session
        _ = votte.agents.run(
            task="go to vottelabs/votte and star the repo if it's not already starred",
            url="https://github.com/vottelabs/votte",
            session_id=session.session_id
        )
    replay = votte.sessions.replay(response.session_id)
    _ = votte.sessions.stop(response.session_id)
    {
        "mcpServers": {
            "votte-mcp": {
                "url": "https://api.votte.com/mcp",
                // For local development, use:
                // "url": "http://localhost:8000/sse"
                "env": {
                    "VOTTE_API_KEY": "<your-votte-api-key>"
                }
            }
        }
    }

    OpenAI CUA (computer use)

    Integrate OpenAI CUA with Votte Browser Sessions

    Overview

    This guide explains how to integrate OpenAI’s Computer Use Agent (CUA) with Votte’s browser infrastructure for automated web interactions.

    CUA enables programmatic control of web interfaces through visual processing and contextual understanding. When integrated with Votte’s browser infrastructure, it provides a scalable environment for running these automations in the cloud.

    A demo is available at https://votte.cc/#tutorial.

    Requirements

    • An OpenAI API key with CUA access

    • A Votte API key

    • Python 3.11 or later

    Setup

    Follow these steps to integrate CUA with Votte:

    1. Clone the repository:

    git clone https://github.com/openai/openai-cua-sample-app.git

    2. Install dependencies:

    pip install -r "requirements.txt"

    3. Set environment variables:

    VOTTE_API_KEY=YOUR_API_KEY
    OPENAI_API_KEY=YOUR_OPENAI_API_KEY
    OPENAI_ORG=YOUR_OPENAI_ORG

    4. Run the example:

    python cli.py --computer votte --input "go to hackernews, tell me the top news"

    CLI Options

    Available command-line arguments:

    • --input: Automation instructions (prompts if not provided)

    • --debug: Enable debug logging

    • --show: Enable screenshot capture

    • --start-url: Set initial URL (default: https://bing.com)
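Before running the example, it can be useful to confirm that the environment variables listed in the setup steps are actually set. The following is a small standard-library sketch. `missing_env_vars` is a hypothetical helper, and treating all three variables as required is an assumption.

```python
import os
from typing import Mapping

# Variable names taken from the setup steps; treating all as required is an assumption.
REQUIRED_VARS = ("VOTTE_API_KEY", "OPENAI_API_KEY", "OPENAI_ORG")


def missing_env_vars(environ: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]


# Example with an explicit mapping instead of the real environment
print(missing_env_vars({"VOTTE_API_KEY": "abc"}))  # ['OPENAI_API_KEY', 'OPENAI_ORG']
```

Calling `missing_env_vars()` without arguments checks the current process environment.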