Vyokky/dev Agent and automator modularization + Learning for demonstration #50

Merged
merged 49 commits into from
Apr 8, 2024
Changes from 1 commit
Commits
92f354e
Add record processor for the user demonstration learning
yunhao0204 Mar 26, 2024
398a778
Polish code and enable UFO Rag from user demostration
yunhao0204 Mar 27, 2024
73e91cf
Add README and related sample files
yunhao0204 Mar 27, 2024
b79e5f2
basic agent classes
vyokky Mar 28, 2024
1c5fa1d
basic classes
vyokky Mar 28, 2024
db478e9
import refine
vyokky Mar 31, 2024
db0c127
agent framework
vyokky Mar 31, 2024
a41a171
agent framework
vyokky Mar 31, 2024
d29bf1d
log and memory management
vyokky Apr 1, 2024
23e42c0
plugin
vyokky Apr 1, 2024
8a8536f
plugin
vyokky Apr 1, 2024
1c7ad6f
Enable record_processor return multiple plans to let user choose
yunhao0204 Apr 1, 2024
0ff8289
plugin basic
vyokky Apr 1, 2024
8df4178
plugin basic
vyokky Apr 1, 2024
d2ca2c4
refinement
vyokky Apr 1, 2024
fc3fb33
refinement
vyokky Apr 1, 2024
8483a1b
refinement
vyokky Apr 1, 2024
40cb6d9
app puppeteer
vyokky Apr 2, 2024
8c8f81b
app puppeteer
vyokky Apr 2, 2024
7531be5
sorted imported
vyokky Apr 2, 2024
923ee6c
name change
vyokky Apr 2, 2024
833b6e2
name change
vyokky Apr 2, 2024
3b87cca
Fix the json parse issue and update README
yunhao0204 Apr 2, 2024
654da0f
readme for offline RAG
vyokky Apr 2, 2024
d083ebf
readme for offline RAG
vyokky Apr 2, 2024
a530d84
puppeteer init
vyokky Apr 2, 2024
29c2e22
puppeteer init
vyokky Apr 2, 2024
4fb0937
Merge branch 'vyokky/dev' into demonstration
yunhao0204 Apr 2, 2024
c11b097
Merge pull request #49 from yunhao0204/demonstration
vyokky Apr 2, 2024
658a86d
agent name change
vyokky Apr 2, 2024
eb96de9
config fixed
vyokky Apr 2, 2024
9ca9b2b
config fixed
vyokky Apr 2, 2024
3c108b8
error trace
vyokky Apr 2, 2024
f89be86
rm redundant
vyokky Apr 2, 2024
5adb714
merge
vyokky Apr 2, 2024
25d5923
bug fix
vyokky Apr 2, 2024
733e282
bug fix
vyokky Apr 2, 2024
1149b02
color change
vyokky Apr 2, 2024
988f110
readme
vyokky Apr 2, 2024
86f4213
rm redundant
vyokky Apr 2, 2024
55dfb83
rm old prompt
vyokky Apr 2, 2024
591f945
sort imports
vyokky Apr 2, 2024
8b8f464
check com exist
vyokky Apr 2, 2024
c5da96f
comment and docstring
vyokky Apr 3, 2024
e7f734a
comment and docstring
vyokky Apr 3, 2024
7f5c58c
fix comment
vyokky Apr 7, 2024
123fa88
fix comment
vyokky Apr 8, 2024
718e74b
fix agent register
vyokky Apr 8, 2024
adc8b13
memory fix
vyokky Apr 8, 2024
Polish code and enable UFO Rag from user demostration
yunhao0204 committed Mar 27, 2024
commit 398a778b31b7884764e7a2caf39567a7ac391f3a
1 change: 1 addition & 0 deletions .gitignore
@@ -25,5 +25,6 @@ vectordb/demonstration/*

 # Don't ignore the example files
 !vectordb/docs/example/
+!vectordb/demonstration/example.yaml

 .vscode
11 changes: 10 additions & 1 deletion README.md
@@ -142,7 +142,7 @@ RAG_ONLINE_RETRIEVED_TOPK: 1 # The topk for the online retrieved documents
 Adjust `RAG_ONLINE_SEARCH_TOPK` and `RAG_ONLINE_RETRIEVED_TOPK` to get better performance.


-#### RAG from Self-Demonstration
+#### RAG from Previous-Experience
 Save task completion trajectories into UFO's memory for future reference. This can improve its future success rates based on its previous experiences!

 After completing a task, you'll see the following message:
@@ -157,6 +157,15 @@ RAG_EXPERIENCE: True # Whether to use the RAG from its self-experience.
RAG_EXPERIENCE_RETRIEVED_TOPK: 5 # The topk for the offline retrieved documents
```

#### RAG from User-Demonstration
Boost UFO's capabilities through user demonstration! Utilize Microsoft Steps Recorder to record step-by-step processes for achieving specific tasks. With a simple command processed by the record_processor (refer to the [README](./record_processor/README.md)), UFO can store these trajectories in its memory for future reference, enhancing its learning from user interactions.

You can enable this function by setting the following configuration:
```bash
## RAG Configuration for demonstration
RAG_DEMONSTRATION: True # Whether to use the RAG from its user demonstration.
RAG_DEMONSTRATION_RETRIEVED_TOPK: 5 # The topk for the offline retrieved documents
```


### 🎉 Step 4: Start UFO
22 changes: 14 additions & 8 deletions record_processor/parser/demonstration_record.py
@@ -17,7 +17,6 @@ def __init__(self, application: str, description: str, action: str, screenshot:
         self.comment = comment
         self.screenshot = screenshot

-
 class DemonstrationRecord:
     """
     Class for the user demonstration record.
@@ -28,27 +27,34 @@ def __init__(self, applications: list, step_num: int, **steps: DemonstrationStep
         """
         Create a new Record.
         """
-        self.request = ""
-        self.round = 0
-        self.applications = applications
-        self.step_num = step_num
+        self.__request = ""
+        self.__round = 0
+        self.__applications = applications
+        self.__step_num = step_num
         # adding each key-value pair in steps to the record
         for index, step in steps.items():
             setattr(self, index, step.__dict__)

     def set_request(self, request: str):
         """
         Set the request.
         """
-        self.request = request
+        self.__request = request

     def get_request(self) -> str:
         """
         Get the request.
         """
-        return self.request
+        return self.__request

     def get_applications(self) -> list:
         """
         Get the application.
         """
-        return self.applications
+        return self.__applications
+
+    def get_step_num(self) -> int:
+        """
+        Get the step number.
+        """
+        return self.__step_num
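
The move from public attributes to double-underscore attributes relies on Python's name mangling: inside the class body, `self.__request` is stored as `_DemonstrationRecord__request`, so callers outside the class have to use the getters. A minimal sketch of that behaviour, using a simplified stand-in class (`Record` and its sample values are illustrative, not the project's actual class):

```python
class Record:
    """Simplified stand-in for a record with name-mangled attributes."""

    def __init__(self, applications: list, step_num: int, **steps: dict):
        self.__request = ""  # stored on the instance as _Record__request
        self.__applications = applications
        self.__step_num = step_num
        # Step entries are still attached under their plain keyword names.
        for index, step in steps.items():
            setattr(self, index, step)

    def set_request(self, request: str) -> None:
        self.__request = request

    def get_request(self) -> str:
        return self.__request

    def get_step_num(self) -> int:
        return self.__step_num


record = Record(["outlook.exe"], 1, step_1={"action": "click"})
record.set_request("send an email")

print(record.get_request())        # access through the getter works
print(hasattr(record, "request"))  # the old public name no longer exists
print(sorted(vars(record)))        # mangled names plus the plain step key
```

One consequence that may be worth checking: any code that serializes the record through `__dict__` would now see the mangled keys rather than `request` or `step_num`.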
2 changes: 1 addition & 1 deletion record_processor/summarizer/summarizer.py
@@ -146,7 +146,7 @@ def create_or_update_vector_db(summaries: list, db_path: str):
     # Check if the db exists, if not, create a new one.
     if os.path.exists(db_path):
         prev_db = FAISS.load_local(
-            db_path, embeddings, allow_dangerous_deserialization=True)
+            db_path, embeddings)
         db.merge_from(prev_db)

     db.save_local(db_path)
4 changes: 3 additions & 1 deletion ufo/config/config.yaml.template
@@ -95,7 +95,9 @@ RAG_ONLINE_RETRIEVED_TOPK: 1 # The topk for the online retrieved documents
 RAG_EXPERIENCE: True # Whether to use the RAG from its self-experience.
 RAG_EXPERIENCE_RETRIEVED_TOPK: 5 # The topk for the offline retrieved documents

-
+## RAG Configuration for demonstration
+RAG_DEMONSTRATION: True # Whether to use the RAG from its user demonstration.
+RAG_DEMONSTRATION_RETRIEVED_TOPK: 5 # The topk for the offline retrieved documents



44 changes: 38 additions & 6 deletions ufo/module/flow.py
@@ -60,6 +60,7 @@ def __init__(self, task):
         self.offline_doc_retriever = None
         self.online_doc_retriever = None
         self.experience_retriever = None
+        self.demonstration_retriever = None
         self.control_reannotate = None

         welcome_text = """
@@ -177,7 +178,11 @@ def process_application_selection(self):
                 experience_path = configs["EXPERIENCE_SAVED_PATH"]
                 db_path = os.path.join(experience_path, "experience_db")
                 self.experience_retriever = retriever_factory.ExperienceRetriever(db_path)
-
+            if configs["RAG_DEMONSTRATION"]:
+                print_with_color("Creating an demonstration indexer...", "magenta")
+                demonstration_path = configs["DEMONSTRATION_SAVED_PATH"]
+                db_path = os.path.join(demonstration_path, "demonstration_db")
+                self.demonstration_retriever = retriever_factory.DemonstrationRetriever(db_path)

             time.sleep(configs["SLEEP_TIME"])

@@ -236,13 +241,24 @@ def process_action_selection(self):
             screenshot_url = encode_image_from_path(screenshot_save_path)
             screenshot_annotated_url = encode_image_from_path(annotated_screenshot_save_path)
             image_url += [screenshot_url, screenshot_annotated_url]
-
+
+        examples = []
+        tips = []
         if configs["RAG_EXPERIENCE"]:
-            examples, tips = self.rag_experience_retrieve()
+            experience_examples, experience_tips = self.rag_experience_retrieve()
         else:
-            examples = []
-            tips = []
-
+            experience_examples = []
+            experience_tips = []
+
+        if configs["RAG_DEMONSTRATION"]:
+            demonstration_examples, demonstration_tips = self.rag_demonstration_retrieve()
+        else:
+            demonstration_examples = []
+            demonstration_tips = []
+
+        examples += experience_examples + demonstration_examples
+        tips += experience_tips + demonstration_tips
+
         action_selection_prompt_system_message = self.act_selection_prompter.system_prompt_construction(examples, tips)
         action_selection_prompt_user_message = self.act_selection_prompter.user_content_construction(image_url, self.request_history, self.action_history,
                                                                                                     control_info, self.plan, self.request, self.rag_prompt(), configs["INCLUDE_LAST_SCREENSHOT"])
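
The new logic in `process_action_selection` amounts to "collect from every enabled source, then concatenate". A compact sketch of the same idea, assuming each retriever is a zero-argument callable that returns an `(examples, tips)` pair; `collect_rag_context` and the `sources` mapping are hypothetical names, not part of UFO:

```python
from typing import Callable, Dict, List, Tuple

RetrieveFn = Callable[[], Tuple[List[dict], List[str]]]


def collect_rag_context(configs: Dict[str, bool],
                        sources: Dict[str, RetrieveFn]) -> Tuple[List[dict], List[str]]:
    """Concatenate examples and tips from every source whose config flag is on."""
    examples: List[dict] = []
    tips: List[str] = []
    for flag, retrieve in sources.items():
        if configs.get(flag, False):
            source_examples, source_tips = retrieve()
            examples += source_examples
            tips += source_tips
    return examples, tips


# Stub retrievers standing in for rag_experience_retrieve / rag_demonstration_retrieve.
sources = {
    "RAG_EXPERIENCE": lambda: ([{"Thought": "from experience"}], ["tip A"]),
    "RAG_DEMONSTRATION": lambda: ([{"Thought": "from demonstration"}], ["tip B"]),
}

examples, tips = collect_rag_context(
    {"RAG_EXPERIENCE": False, "RAG_DEMONSTRATION": True}, sources)
print(examples, tips)
```

Written this way, adding a third retrieval source would only require a new entry in the mapping instead of another `if`/`else` pair.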
@@ -417,7 +433,23 @@ def experience_saver(self):
         self.cost += total_cost
         print_with_color("The experience has been saved.", "cyan")

+    def rag_demonstration_retrieve(self):
+        """
+        Retrieving demonstration examples for the user request.
+        :return: The retrieved examples and tips string.
+        """
+
+        # Retrieve demonstration examples. Only retrieve the examples that are related to the current application.
+        demonstration_docs = self.demonstration_retriever.retrieve(self.request, configs["RAG_DEMONSTRATION_RETRIEVED_TOPK"])
+
+        if demonstration_docs:
+            examples = [doc.metadata.get("example", {}) for doc in demonstration_docs]
+            tips = [doc.metadata.get("Tips", "") for doc in demonstration_docs]
+        else:
+            examples = []
+            tips = []
+
+        return examples, tips
     def set_new_round(self):
         """
         Start a new round.
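
`rag_demonstration_retrieve` expects each retrieved document to carry `example` and `Tips` entries in its metadata, and falls back to `{}` or `""` when they are absent. A defensive variant might look like the sketch below. The `Doc` dataclass stands in for the vector store's document type and `extract_examples_and_tips` is a hypothetical helper; unlike the code in the diff, it skips documents without an example instead of forwarding an empty one to the prompt.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class Doc:
    """Stand-in for a retrieved document; only the metadata is used here."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def extract_examples_and_tips(docs: Optional[List[Doc]]) -> Tuple[List[dict], List[str]]:
    """Pull `example` and `Tips` out of each document's metadata.

    Returns two empty lists when nothing was retrieved, and drops
    documents that have no `example` entry.
    """
    examples: List[dict] = []
    tips: List[str] = []
    for doc in docs or []:
        example = doc.metadata.get("example")
        if not example:
            continue
        examples.append(example)
        tips.append(doc.metadata.get("Tips", ""))
    return examples, tips


docs = [
    Doc("send an email", {"example": {"Function": "SetText"}, "Tips": "- Check the address."}),
    Doc("broken entry", {}),
]
print(extract_examples_and_tips(docs))
print(extract_examples_and_tips(None))
```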
4 changes: 2 additions & 2 deletions ufo/prompter/demonstration_prompter.py
@@ -69,7 +69,7 @@ def user_content_construction(self, demo_record: DemonstrationRecord) -> list[di
         })

         # Get the total steps of the demonstration record. And construct the agent trajectory.
-        step_num = demo_record.step_num
+        step_num = demo_record.get_step_num()

         user_content.append({
             "type": "text",
@@ -94,7 +94,7 @@ def user_content_construction(self, demo_record: DemonstrationRecord) -> list[di
         # Add the user request.
         user_content.append({
             "type": "text",
-            "text": self.user_prompt_construction(demo_record.__getattribute__("request"))
+            "text": self.user_prompt_construction(demo_record.get_request())
         })

         return user_content
@@ -20,7 +20,7 @@ system: |-
   - You are required to response in a JSON format, consisting of 10 distinct parts with the following keys and corresponding content:
     {{"Observation": <Describe the initial screenshot of the application window in detail, including observations about the application's status relevant to the user request.>
     "Thought": <Outline the logic behind the first action required to fulfill the request.>
-    "ControlLabel": <Specify the precise annotated label of the control item to be selected at the first step. If none of the control items are suitable or the task is complete, output an empty string.>
+    "ControlLabel": <Specify the precise annotated label of the control item to be selected at the first step. If none of the control items are suitable or the task is complete, output an random number.>
     "ControlText": <Specify the precise control_text of the control item to be selected at the first step. If none of the control items are suitable or the task is complete, output an empty string ''.>
     "Function": <Specify the precise API function name (without arguments) to be called on the control item to complete the user request. Leave it as an empty string "" if no suitable API function exists or the task is complete.>
     "Args": <Specify the precise arguments in dictionary format of the selected API function to be called on the control item to complete the user request. Leave it as an empty dictionary {{}} if the API does not require arguments, or no suitable API function exists, or the task is complete.>
26 changes: 26 additions & 0 deletions ufo/rag/retriever_factory.py
@@ -157,6 +157,32 @@ def get_indexer(self):



+class DemonstrationRetriever(Retriever):
+    """
+    Class to create demonstration retrievers.
+    """
+
+    def __init__(self, db_path) -> None:
+        """
+        Create a new DemonstrationRetriever.
+        :db_path: The path to the database.
+        """
+        self.indexer = self.get_indexer(db_path)
+
+
+    def get_indexer(self, db_path: str):
+        """
+        Create a demonstration indexer.
+        :db_path: The path to the database.
+        """
+
+        try:
+            embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
+            db = FAISS.load_local(db_path, embeddings)
+            return db
+        except:
+            print_with_color("Warning: Failed to load demonstration indexer from {path}.".format(path=db_path), "yellow")
+            return None
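
`get_indexer` returns `None` when loading fails, so whatever calls the retriever afterwards has to cope with a missing index. The sketch below shows one way to make that explicit. `SafeRetriever`, the injected `loader` callable and the `similarity_search` method on the loaded index are assumptions for illustration (the base `Retriever.retrieve` is not shown in this diff). It also catches `Exception` rather than using a bare `except:`, which would swallow `KeyboardInterrupt` and `SystemExit` as well.

```python
from typing import Any, Callable, List, Optional


class SafeRetriever:
    """Illustrative retriever that tolerates a missing index."""

    def __init__(self, db_path: str, loader: Callable[[str], Any]) -> None:
        self.indexer = self._load(db_path, loader)

    @staticmethod
    def _load(db_path: str, loader: Callable[[str], Any]) -> Optional[Any]:
        try:
            return loader(db_path)
        except Exception as error:  # narrower than a bare `except:`
            print(f"Warning: failed to load indexer from {db_path}: {error}")
            return None

    def retrieve(self, query: str, top_k: int) -> List[Any]:
        # Without an index there is nothing to search, so return no documents.
        if self.indexer is None:
            return []
        return self.indexer.similarity_search(query, top_k)


class FakeIndex:
    """Tiny in-memory index used only to exercise the sketch."""

    def __init__(self, items: List[str]) -> None:
        self.items = items

    def similarity_search(self, query: str, top_k: int) -> List[str]:
        return [item for item in self.items if query in item][:top_k]


def failing_loader(path: str) -> Any:
    raise FileNotFoundError(path)


ok = SafeRetriever("demo_db", lambda path: FakeIndex(["send email", "open file"]))
broken = SafeRetriever("missing_db", failing_loader)
print(ok.retrieve("email", 5))
print(broken.retrieve("email", 5))
```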



37 changes: 37 additions & 0 deletions vectordb/demonstration/example.yaml
@@ -0,0 +1,37 @@
+example0:
+  example:
+    Observation: The screenshot shows the Microsoft Outlook application with an email
+      composition window open. The 'To' field is empty, and no email address has been
+      entered. The subject and body of the email are also blank. The last action taken
+      was opening the Outlook application.
+    Thought: Based on the screenshot, the first step is to input the email address
+      [email protected] into the 'To' field.
+    ControlLabel: '2'
+    ControlText: ''
+    Function: SetText
+    Args:
+      text: [email protected]
+    Status: CONTINUE
+    Plan: '(1) Input the email address [email protected] into the ''To'' field.
+
+      (2) Input the subject of the email. Since the user request is to say hello,
+      the subject can be ''Hello''.
+
+      (3) Input the content of the email. The content should be a friendly greeting,
+      such as ''Hi there,\nJust wanted to drop a quick note to say hello!\nBest regards.''
+
+      (4) Click the Send button to send the email.'
+    Comment: The user has provided the email address and the content to be sent. The
+      trajectory shows that the user began to input an incorrect email and content,
+      which does not match the user request. I will correct this by inputting the
+      right email address and content.
+  Tips: '- Ensure the email address is entered correctly in the ''To'' field.
+
+    - Use a clear and concise subject for the email.
+
+    - Draft the email content in a friendly and professional tone.
+
+    - Review the email before sending to avoid any mistakes.'
+  request: send email to [email protected] and say hello
+  app_list:
+  - MSEDGEWEBVIEW2.EXE
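
For orientation, the entry above can be read as a query string plus metadata: the retrieval code in `flow.py` searches with the user request and reads `example` and `Tips` from each hit. The sketch below mirrors that shape with plain dictionaries. That `request` is the text being indexed is an inference from the retrieval call, not something this commit shows, and `to_index_record` is a hypothetical helper.

```python
from typing import Any, Dict, Tuple

# Trimmed copy of the entry's structure; values are shortened placeholders.
entry: Dict[str, Any] = {
    "example": {
        "ControlLabel": "2",
        "ControlText": "",
        "Function": "SetText",
        "Args": {"text": "<recipient address>"},
        "Status": "CONTINUE",
    },
    "Tips": "- Ensure the email address is entered correctly in the 'To' field.",
    "request": "send email to <recipient address> and say hello",
    "app_list": ["MSEDGEWEBVIEW2.EXE"],
}


def to_index_record(entry: Dict[str, Any]) -> Tuple[str, Dict[str, Any]]:
    """Split an entry into (text to embed, metadata kept alongside it)."""
    required = ("request", "example", "Tips")
    missing = [key for key in required if key not in entry]
    if missing:
        raise KeyError(f"entry is missing keys: {missing}")
    metadata = {key: value for key, value in entry.items() if key != "request"}
    return entry["request"], metadata


text, metadata = to_index_record(entry)
print(text)
print(sorted(metadata))
```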