JARVIS AGI || AI Powered Voice Assistant with Real Human Capabilities

SreejanPersonal/JARVIS-AGI

YouTube Telegram Instagram LinkedIn Support on Patreon Buy Me A Coffee

🛑 Follow This Series Live: Jarvis 2.0 Series


JARVIS-AGI

Project Overview

JARVIS-AGI is an AI project that integrates speech recognition, text processing, and image analysis into a single system. Named after the iconic AI assistant from popular culture, Jarvis uses natural language processing so that users can interact with it through voice commands. It is designed to handle everyday tasks such as checking the weather, setting reminders, managing calendars, and searching the web, with the aim of making those tasks simpler and more convenient.

Features

  • Speech Recognition: Convert spoken language into text using various models.
  • Text Processing: Analyze and generate text with multiple AI tools.
  • Image Analysis: Perform image recognition and processing tasks.
  • Audio Tools: Detect hotwords and manage audio playback interruptions.
  • Interactive Prompts: Predefined prompts to guide AI interactions.

Directory Structure

The project is organized into several key directories:

JARVIS-AGI/
├── .env
├── .env.example
├── .gitattributes
├── .gitignore
├── ASSETS/
│   ├── CLAP_DETECTS/
│   │   └── MODELS/
│   │       └── Model.txt
│   ├── SOUNDS/
│   │   ├── activation_sound.wav
│   │   ├── audio_file.mp3
│   │   └── deactivation_sound.wav
│   ├── STREAM_AUDIOS/
│   │   ├── output_audio_6.mp3
│   │   ├── output_audio_7.mp3
│   │   ├── output_audio_8.mp3
│   │   ├── output_audio_9.mp3
│   │   ├── output_audio_10.mp3
│   │   ├── output_audio_11.mp3
│   │   ├── output_audio_12.mp3
│   │   ├── output_audio_13.mp3
│   │   ├── output_audio_14.mp3
│   │   ├── output_audio_15.mp3
│   │   ├── output_audio_16.mp3
│   │   ├── output_audio_17.mp3
│   │   ├── output_audio_18.mp3
│   │   ├── output_audio_19.mp3
│   │   └── output_audio_20.mp3
│   ├── USERDATA/
│   │   └── LE CHAT/
│   │       └── How_To_Store_UserData.txt
│   ├── Vosk/
│   ├── available_working_proxies.txt
│   ├── conversation_history.json
│   └── openGPT_IDs.txt
├── BRAIN/
│   ├── AI/
│   │   ├── IMAGE/
│   │   │   ├── decohere_ai.py
│   │   │   └── deepInfra_IMG.py
│   │   ├── TEXT/
│   │   │   ├── API/
│   │   │   │   ├── Blackbox_ai.py
│   │   │   │   ├── Bnn_GPT.py
│   │   │   │   ├── FarFalle.py
│   │   │   │   ├── Hugging_Face_TEXT.py
│   │   │   │   ├── Le_Chat.py
│   │   │   │   ├── Phind.py
│   │   │   │   ├── Pi_Ai.py
│   │   │   │   ├── Uncensored.py
│   │   │   │   ├── basedGPT.py
│   │   │   │   ├── deepInfra_TEXT.py
│   │   │   │   ├── deepseek_ai.py
│   │   │   │   ├── hugging_chat.py
│   │   │   │   ├── liaobots.py
│   │   │   │   ├── openGPT.py
│   │   │   │   └── openrouter.py
│   │   │   ├── LOCAL/
│   │   │   │   └── llama_CPP.py
│   │   │   └── STREAM/
│   │   │       ├── basedGPT.py
│   │   │       └── deepInfra_TEXT.py
│   │   └── VISION/
│   │       └── deepInfra_VISION.py
│   └── TOOLS/
│       └── groq_web_access.py
├── ENGINE/
│   ├── STT/
│   │   ├── DevsDoCode.py
│   │   ├── NetHyTech.py
│   │   ├── src/
│   │   │   └── index.html
│   │   └── vosk_recog.py
│   └── TTS/
│       ├── STREAMING/
│       │   ├── DeepGram.py
│       │   └── speechify.py
│       ├── DeepGram.py
│       ├── ElevenLabs.py
│       ├── ai_voice.py
│       ├── deepAI.py
│       ├── edge_tts.py
│       ├── hearling.py
│       ├── speechify.py
│       └── stream_elements_api.py
├── PLAYGROUND/
│   ├── ADB_CALL/
│   │   ├── ADB COMMANDS.txt
│   │   ├── Details.txt
│   │   ├── IMP Commands.txt
│   │   ├── Information.txt
│   │   ├── android_device_connection_setup.py
│   │   └── make_call.py
│   ├── CAMERA/
│   │   └── camera_vision.py
│   ├── CLAP_NN/
│   │   ├── DATASETS/
│   │   │   └── Informtation.txt
│   │   ├── ClapDetector.py
│   │   ├── Model_Trainer.py
│   │   ├── audio_inference.py
│   │   ├── cnn_sound_model.py
│   │   └── load_dataset.py
│   └── WEBSITE_ASSISTANT/
│       ├── chrome_latest_url.py
│       └── jenna_reader.py
├── PROMPTS/
│   ├── BISECTORS.py
│   ├── INSTRUCTIONS.py
│   ├── PROMPTS.py
│   └── SYSTEM.py
├── TOOLS/
│   ├── AUDIO/
│   │   ├── Hotword_Detection.py
│   │   └── Interrupted_Playsound.py
│   ├── LE_CHAT_COOKIES/
│   │   └── Cookie_Extractor.py
│   ├── SYSTEM_SETTINGS/
│   │   ├── SETTING.py
│   │   ├── system_theme.py
│   │   └── taskbar.py
│   ├── Alpaca_DS_Converser.py
│   ├── ProxyAPI.py
│   ├── RawDog.py
│   ├── TXT_DS_Converser.py
│   ├── Web_Results.py
│   └── stream_audio_cleanup.py
├── CODE_OF_CONDUCT.md
├── IMPORTS.py
├── LICENCE
├── Le_Chat_Tester.py
├── Memory ConvoTxt.py
├── SpeedTester.py
├── StreamSpeak.py
├── WebTester.py
├── main.py
├── readme.md
└── requirements.txt
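
After cloning, it can help to confirm that the top-level directories listed in the tree are actually present before running anything. The following is a minimal sketch, not part of the repository: the helper name missing_dirs and the EXPECTED_DIRS tuple are hypothetical, and the directory names are copied from the tree above.

```python
from pathlib import Path

# Top-level directories copied from the directory tree in this README.
EXPECTED_DIRS = ("ASSETS", "BRAIN", "ENGINE", "PLAYGROUND", "PROMPTS", "TOOLS")


def missing_dirs(project_root):
    """Return the expected top-level directories that are absent under project_root."""
    root = Path(project_root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    # Hypothetical usage: run from the root of the cloned repository.
    missing = missing_dirs(".")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("All expected top-level directories are present.")
```

Run it from the root of the clone; an empty result means the layout matches the tree at the directory level.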

Installation

  1. Clone the repository:

    git clone https://github.com/SreejanPersonal/JARVIS-AGI.git
    cd JARVIS-AGI
  2. Install the required packages:

    pip install -r requirements.txt
  3. (Optional) Install Vosk Speech Recognition Models:

    Vosk provides pre-trained models for various languages. To install the models for your desired language, follow these steps:

    • Go to the Vosk GitHub repository releases page: Vosk GitHub Releases
    • Download the model folder for your language. For example, if you want English models, download the folder named vosk-model-en-us-aspire-0.2.
    • Extract the contents of the folder into a directory named ASSETS in your project directory.
    • Ensure that the extracted model folder is directly under the ASSETS directory, without any additional nesting.
    • Now, you should have a structure like this: <your-main-project-directory>/ASSETS/vosk-model-en-us-aspire-0.2.
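    • Modify main.py or any relevant script so that model_path points to the model directory you extracted. The example below loads a different model, vosk-model-small-en-us-0.15, from ASSETS/Vosk; substitute the path you actually used: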
      from ENGINE.STT.vosk_recog import speech_to_text
      
      for speech in speech_to_text(model_path="ASSETS/Vosk/vosk-model-small-en-us-0.15"):
          if speech != "":
              print("Human >>", speech)
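
Because the model path is easy to get wrong (the steps above extract into ASSETS, while the example reads from ASSETS/Vosk), it may be worth checking that the directory exists before starting recognition. The following is a minimal sketch under that assumption: resolve_vosk_model is a hypothetical helper, not part of the repository, and only speech_to_text and its model_path parameter come from the example above.

```python
from pathlib import Path


def resolve_vosk_model(model_path):
    """Return the model directory as a Path, or raise if it is not an existing directory."""
    path = Path(model_path)
    if not path.is_dir():
        raise FileNotFoundError(
            f"Vosk model directory not found: {path}. "
            "Check where the model was extracted and adjust model_path."
        )
    return path


if __name__ == "__main__":
    try:
        # Path taken from the example above; replace it with your own location.
        model_dir = resolve_vosk_model("ASSETS/Vosk/vosk-model-small-en-us-0.15")
    except FileNotFoundError as err:
        print(err)
    else:
        # Imported only after the check, so a wrong path is reported first.
        from ENGINE.STT.vosk_recog import speech_to_text

        for speech in speech_to_text(model_path=str(model_dir)):
            if speech != "":
                print("Human >>", speech)
```

Failing early with a readable message is a design choice here; it avoids a less obvious error surfacing from inside the recognizer.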

Usage

  1. Run the main script:

    python main.py
  2. Configuration: Modify the API configuration in the .env file to suit your needs.
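
For readers unfamiliar with .env files, the sketch below shows one way such a file can be read using only the standard library. It is an illustration under assumptions: this README does not say how the project itself loads .env (it may use a library such as python-dotenv), the function names parse_env_text and load_env_file are hypothetical, and copying .env.example to .env is assumed to be the intended starting point.

```python
import os


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines, # comments, and lines without '='."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Trim whitespace and any surrounding quote characters from the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path=".env"):
    """Read a .env file and copy its entries into os.environ without overwriting existing ones."""
    with open(path, encoding="utf-8") as handle:
        values = parse_env_text(handle.read())
    for key, value in values.items():
        os.environ.setdefault(key, value)
    return values


if __name__ == "__main__":
    if os.path.exists(".env"):
        loaded = load_env_file(".env")
        # Print key names only, so secrets are not echoed to the terminal.
        print("Loaded keys:", sorted(loaded))
    else:
        print("No .env file found; copy .env.example to .env and fill it in.")
```

Using setdefault means values already set in the shell take precedence over the file, which makes it easy to override a single key temporarily.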

Contributing

We welcome contributions to improve JARVIS-AGI. To contribute, follow these steps:

  1. Fork the repository.
  2. Create a new branch (git checkout -b feature-branch).
  3. Commit your changes (git commit -m 'Add new feature').
  4. Push to the branch (git push origin feature-branch).
  5. Create a new Pull Request.

License

This project is licensed under the MIT License. See the LICENCE file for details.

Connect with Us

Made With 💓 By - Sree (Devs Do Code)

For any questions or concerns, reach out to us via our social media handles. Our top choice for contact is Telegram: Devs Do Code Telegram


Devs Do Code

Dive into the world of coding with Devs Do Code - where passion meets programming! Make sure to hit that Subscribe button to stay tuned for exciting content!

Pro Tip: For optimal performance and a seamless experience, we recommend using the default library versions demonstrated in this demo. Your coding journey just got even better! Happy coding!


Now you're all set to explore the Devs Do Code project! Enjoy coding!
