-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Vyokky/dev Agent and automator modularization + Learning for demonstration #50
Conversation
vyokky
commented
Apr 2, 2024
- Extract agent module for HostAgent and AppAgent, intergrate into main flow.
- Build automator module for AppAgent, for UI controls and API.
- Learning from human demonstration (Yun Hao)
Enable human demonstration RAG
""" | ||
|
||
user_action_data = re.search( | ||
r'<UserActionData>(.*?)</UserActionData>', content, re.DOTALL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are many hard-code implementation like this. Need some regression test to secure, in case the PSR changes its format in future release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree on the regression test. Will discuss offline about the implementation.
UFO can learn from user-provided demonstrations for specific requests and use them as references in the future when encountering similar tasks. Providing clear demonstrations along with precise requests can significantly enhance UFO's performance. | ||
|
||
## How to Enable and Config this Function ❓ | ||
You can enable this function by setting the following configuration: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to mention which configuration file as we have two in UFO to set these configs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, will update.
screenshot = self.get_screenshot(screenshot_file_name) | ||
step_key = f"step_{int(action_number) - 1}" | ||
|
||
step = DemonstrationStep( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some of the fields may not be extracted correctly in some cases (e.g., PSR changes it format). May need sanity check on the values.
|
||
return record | ||
|
||
def find_boundary(self) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are private methods, could use private function definition.
def find_boundary(self) -> str: | |
def __find_boundary(self) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, will update
summary = dict() | ||
summary["example"] = {} | ||
for key in ["Observation", "Thought", "ControlLabel", "ControlText", "Function", "Args", "Status", "Plan", "Comment"]: | ||
summary["example"][key] = response_json.get(key, "") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we use empty string if the key is not in the set? May add some comments on this.
self.logger = self.initialize_logger(self.log_path, "response.log") | ||
self.request_logger = self.initialize_logger(self.log_path, "request.log") | ||
|
||
self.app_selection_prompter = ApplicationAgentPrompter(configs["APP_AGENT"]["VISUAL_MODE"], configs["APP_SELECTION_PROMPT"], configs["APP_SELECTION_EXAMPLE_PROMPT"], configs["API_PROMPT"]) | ||
self.act_selection_prompter = ActionAgentPrompter(configs["ACTION_AGENT"]["VISUAL_MODE"], configs["ACTION_SELECTION_PROMPT"], configs["ACTION_SELECTION_EXAMPLE_PROMPT"], configs["API_PROMPT"]) | ||
self.HostAgent = HostAgent("HostAgent", configs["HOST_AGENT"]["VISUAL_MODE"], configs["HOSTAGENT_PROMPT"], configs["HOSTAGENT_EXAMPLE_PROMPT"], configs["API_PROMPT"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may have some way to protect the value errors in config class. In case some config items are missing.
@@ -81,113 +74,118 @@ def process_application_selection(self): | |||
""" | |||
|
|||
# Code for selecting an action |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is too long to read. May split it into several sub functions, to reveal the logic steps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree. Will update the func and break it into different sub-steps.
ufo/module/flow.py
Outdated
|
||
if self.safe_guard(action, control_text): | ||
# Execute the action | ||
results = ui_controller.execution(function_call, args) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UI controller to execute function call is wired. These function calls are actually operations instead of API calls, right? What about use different name here?
@@ -1,31 +1,28 @@ | |||
# Copyright (c) Microsoft Corporation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, I think this file is too long. Some functions could be split and put into different files/classes. Otherwise, it is not easy to read and follow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. Perhaps will break it down for next update.