Turbocharging Math Problem-Solving with Undatas.io and Qwen-max Model

xll
xllAuthor
Published
7minRead time
Turbocharging Math Problem-Solving with Undatas.io and Qwen-max  Model

In the realm of educational technology, we’re constantly on the lookout for innovative ways to enhance learning and problem-solving, especially when it comes to mathematics. Today, I’m excited to share with you a remarkable code recipe that has been meticulously crafted to revolutionize the way we handle math problems sourced from PDF test papers.

Our approach involves synergistically integrating the powerful Undatas.io platform with a cutting-edge large language model. This combination holds tremendous potential, and we’ve already started exploring its capabilities in previous demonstrations.

In the preceding blog, we offered a preliminary look at the problem-solving efficacy by using elementary school math test paper questions. It was like a sneak peek into what this powerful duo could achieve. Now, we’re taking it up a notch. In this new example, we’ll be employing a junior high school math test paper from the American “Math League” Mind Exploration. This will allow us to further showcase the true capabilities and vast potential of this integrated approach.

Let’s dive into the essential steps that we’ll be following in this blog:

Step 1: Bootstrapping with Undatas.io Setup

The foundation of our process lies in setting up the Undatas.io platform properly. Here are the key aspects:

Install the Undatasio Python API library

This is the first crucial step towards unlocking the power of Undatas.io. By installing the library, we create the pathway to access and utilize its various features that will be instrumental in our math problem-solving journey.

To commence using UnDatas.io for math test paper analysis, the first step involves installing the UnDatas.io Python API library. Importing an UnDataIO object requires a token and an optional task name obtained from the UnDatas.io platform.

pip install -U -q openai undatasio
from undatasio.undatasio import UnDatasIO

undatasio_obj = UnDatasIO(UNDATASIO_API_KEY, task_name='PdfParserDemo')

Step 2: Extracting Information from Math Test paper

Once we have the Undatas.io platform ready, it’s time to start extracting the relevant data from the math test paper.

Leverage the show_version function of the Undatas.io object

The show_version function serves as a valuable tool in giving us an overview of the data we’re dealing with. It provides crucial insights about the test paper’s structure and content, helping us organize our approach effectively.

The show_version function of the generated UnDatas.io object provides valuable insights by displaying all version information and file lists associated with the current token’s task name. This aids in organizing and understanding the available resources.

version_data = undatasio_obj.show_version()
version_data.data

Harness the get_result_type method

This particular method is a real gem. It enables us to extract a wide range of information from the PDF test papers, be it text, tables, or other relevant elements. For our math problem-solving focus, it will help us precisely identify and isolate the math problems we need to work on.

Moreover, the get_result_type method of the UnDatas.io object is a powerful tool for extracting diverse information from PDF files, such as text, images, tables, titles, and inline equation details. In the context of math test papers, this comprehensive extraction capability is crucial as it can handle the complex mix of text, diagrams, and formulas commonly found in mathematical assessments.

result = undatasio_obj.get_result_type(
    type_info=['title', 'table', 'text', 'image', 'interline_equation'],
    file_name='2017-2018 Math League First Round Grade 7 (Questions & Answers).pdf',
    version='v5'
)
print(result.data)

Step 3: Bridging to Qwen Model via OpenAI

Now that we have the data from the test paper ready, we need to connect it with the Qwen model through OpenAI.

This is a necessary step to establish the link between our setup and the Qwen model. The SDK acts as the bridge that allows us to communicate with and utilize the model’s capabilities for solving our math problems.

the OpenAI Python SDK needs to be installed as it is utilized to call the Qwen model. Initializing the OpenAI object necessitates applying for an Alibaba Cloud API key, ensuring secure and authorized access to the model.

from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1"
)

Step 4: Solving Math Problems with Qwen-max

Finally, we reach the stage where we put the Qwen-max model to work and solve those math problems.

Configure the Bailian Qwen-max model

We’ll carefully configure the Bailian Qwen-max model by setting appropriate parameters and prompts. This step is crucial as it defines how the model will approach and solve the math problems we’ve extracted from the test paper.

Pinpoint key data: the math problems

Our main goal here is to zero in on the math problems within the test paper. By carefully extracting and identifying these problems, we set the stage for the subsequent steps where we’ll use the power of the language model to solve them.

Once the setup is complete, the Qwen model can be harnessed for solving math problems. By employing the Bailian Qwen-max model and setting appropriate system and user prompts, the model is primed to understand and address the mathematical queries.

Use prompts to find the corresponding questions in the test paper as required, such as the questions in the PDF image above.

completion = client.chat.completions.create(
    model="qwen-max",
    messages=[
        {'role': 'system', 'content': 'You are a text analyst. I will give you a test paper, and you need to return the text information of the specific question according to the question selected by the user. Please note that only the text information of the question asked by the user should be returned, and nothing else should be returned. Test paper: %s' % (result.data, )},
        {'role': 'user', 'content': 'Please help me find the question 6.'}],
    )

max_result = completion.model_dump_json()

The json module plays a significant role in serializing the output of the Qwen-max model and extracting the specific math problems from the user prompts.

import json

title_text = json.loads(max_result)['choices'][0]['message']['content']
title_text

Code running result:

Query the qwen2.5-math-72b-instruct model

With the model configured, we’ll then query the qwen2.5-math-72b-instruct model using the math problems we’ve isolated. This interaction will generate the solutions we’re seeking, demonstrating the power of this integrated approach in handling complex math problems from PDF test papers.

Subsequently, the OpenAI object is used to query the qwen2.5 - math - 72b - instruct model with the extracted math problems, and the response is serialized using json to obtain the final, accurate answers.

completion = client.chat.completions.create(
    model="qwen2.5-math-72b-instruct",
    messages=[
            {'role': 'system', 'content': 'You are a math teacher. Please conduct step-by-step reasoning and use {} to represent the final answer. Return the text in Markdown format.'},
        {'role': 'user', 'content': 'Please solve the question:%s' % (title_text, )}],
    )

math_result = completion.model_dump_json()
result = json.loads(math_result)['choices'][0]['message']['content']
print(result)

Code running result:

Let’s also try asking a question over another piece of the MATHEMATICS CONTEST

The results after running the code following the same steps as before:

Stay tuned as we work through these steps and see the amazing results that unfold when we combine Undatas.io with a powerful large language model for math problem-solving!

I hope you find this exploration as exciting as I do, and I look forward to sharing more insights and findings with you in the future.

📖See Also

Subscribe to Our Newsletter

Get the latest updates and exclusive content delivered straight to your inbox