Sunday Coffee & Code: Last week the RFP Marketplace PoC worked. This week was all about improving outputs (prompt improvements and testing Gemma4)

Last week’s dry run proved the PoC worked. A buyer could issue an RFP, a supplier could upload their company knowledge, the responder agent could generate a grounded draft response, and the assessor agent could evaluate that response against weighted criteria with rationale and evidence.

𝗦𝗼 𝘁𝗵𝗶𝘀 𝘄𝗲𝗲𝗸𝗲𝗻𝗱 𝗜 𝗳𝗼𝗰𝘂𝘀𝗲𝗱 𝗼𝗻 𝗿𝗲𝘀𝗽𝗼𝗻𝘀𝗲 𝗾𝘂𝗮𝗹𝗶𝘁𝘆.

𝘛𝘩𝘦 𝘧𝘪𝘳𝘴𝘵 𝘵𝘩𝘪𝘯𝘨 𝘐 𝘤𝘩𝘢𝘯𝘨𝘦𝘥 𝘸𝘢𝘴 𝘵𝘩𝘦 𝘱𝘳𝘰𝘮𝘱𝘵. The original prompt was a useful starting point, but it was too broad. It produced something plausible, but not something I would want to hand to a proposal team as a first draft. I reworked it to make the responder agent pay much closer attention to the specific RFP requirement, use the supporting RFP information properly, and produce something closer to a proposal section than a general capability statement (𝘢𝘭𝘭 𝘣𝘢𝘴𝘦𝘥 𝘰𝘯 𝘮𝘺 𝘴𝘵𝘢𝘯𝘥𝘢𝘳𝘥 𝘱𝘳𝘰𝘮𝘱𝘵𝘪𝘯𝘨 𝘵𝘦𝘮𝘱𝘭𝘢𝘵𝘦).

That alone helped. The improved prompt produced a clearer and more focused response in about 52 seconds.

𝘛𝘩𝘦𝘯 𝘐 𝘸𝘢𝘯𝘵𝘦𝘥 𝘵𝘰 𝘵𝘦𝘴𝘵 𝘵𝘩𝘦 𝘦𝘧𝘧𝘦𝘤𝘵 𝘰𝘧 𝘢 𝘮𝘰𝘥𝘦𝘭 𝘤𝘩𝘢𝘯𝘨𝘦. I updated Ollama and moved from llama3.1:8b to Google's gemma4:26b.

The full sample prompt is over 36,000 characters, which is not that big. With it, the gemma4 model just spun. A simple Harry Potter prompt worked fine, so the model was not completely broken. When I tested the full prompt via 𝘰𝘭𝘭𝘢𝘮𝘢 𝘳𝘶𝘯, it just dropped back to the prompt. A smaller prompt worked, and I̲ ̲d̲o̲n̲’̲t̲ ̲k̲n̲o̲w̲ ̲w̲h̲y̲ ̲I̲ ̲d̲i̲d̲n̲’̲t̲ ̲c̲l̲u̲e̲ ̲i̲n̲ ̲o̲n̲ ̲t̲h̲i̲s̲ ̲-̲ ̲I̲ ̲h̲a̲v̲e̲ ̲c̲o̲m̲e̲ ̲a̲c̲r̲o̲s̲s̲ ̲t̲h̲e̲ ̲p̲r̲o̲b̲l̲e̲m̲ ̲b̲e̲f̲o̲r̲e̲.̲

So I watched the Ollama logs with 𝘫𝘰𝘶𝘳𝘯𝘢𝘭𝘤𝘵𝘭 -𝘧 -𝘶 𝘰𝘭𝘭𝘢𝘮𝘢, and there it was: 𝘵𝘳𝘶𝘯𝘤𝘢𝘵𝘪𝘯𝘨 𝘪𝘯𝘱𝘶𝘵 𝘱𝘳𝘰𝘮𝘱𝘵 𝘭𝘪𝘮𝘪𝘵=4096 - the 𝘯𝘶𝘮_𝘤𝘵𝘹 problem. Ollama was cutting the prompt down to a 4096-token context window. A quick fix to the API code, a re-test, and all was fine.
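
For anyone hitting the same truncation, here is a minimal sketch of that kind of fix, not the actual PoC code. It assumes the app calls Ollama's /api/generate endpoint on the default local port from Python, and it sizes 𝘯𝘶𝘮_𝘤𝘵𝘹 from a rough 4-characters-per-token rule of thumb. The helper names (estimate_tokens, build_payload, generate) and the 2048-token reply budget are hypothetical choices for illustration.

```python
import json
import urllib.request

# Default local Ollama endpoint (assumption: Ollama runs on this machine).
OLLAMA_URL = "http://localhost:11434/api/generate"


def estimate_tokens(prompt: str, chars_per_token: float = 4.0) -> int:
    """Rough heuristic only: about 4 characters per token for English text."""
    return int(len(prompt) / chars_per_token) + 1


def build_payload(model: str, prompt: str, reply_budget: int = 2048) -> dict:
    """Build a request body whose context window fits the prompt plus a reply."""
    needed = estimate_tokens(prompt) + reply_budget
    # Round up to the next multiple of 1024 tokens.
    num_ctx = -(-needed // 1024) * 1024
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        # Without this option the server falls back to its default context
        # size, which is where the truncation in the logs came from.
        "options": {"num_ctx": num_ctx},
    }


def generate(model: str, prompt: str) -> str:
    """Send the prompt to a running Ollama server and return the reply text."""
    body = json.dumps(build_payload(model, prompt)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

With a 36,000-character prompt this estimates roughly 9,000 prompt tokens, so the requested context window comes out well above the 4096 limit seen in the log. A bigger window uses more memory, so it is worth sizing it to the prompt rather than setting it as high as the model allows.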

The response from gemma4:26b was much better: it followed the requirement more closely, used the RFP context more effectively, and read much more like something a human could refine rather than rewrite. (I also like how the Gemma4 model shows its reasoning - something I have captured in the past as a form of 𝘳𝘦𝘢𝘴𝘰𝘯𝘪𝘯𝘨 𝘢𝘶𝘥𝘪𝘵 𝘵𝘳𝘢𝘪𝘭.)

Still a PoC. Still rough around the edges, but 𝘂𝘀𝗶𝗻𝗴 𝗮 𝗰𝗼𝗺𝗽𝗹𝗲𝘁𝗲𝗹𝘆 𝗼𝗳𝗳𝗹𝗶𝗻𝗲 𝗯𝘂𝗶𝗹𝗱, 𝗮 𝗰𝗼𝗿𝗿𝗲𝗰𝘁𝗹𝘆 𝗰𝗼𝗻𝗳𝗶𝗴𝘂𝗿𝗲𝗱 𝗺𝗼𝗱𝗲𝗹, 𝗮𝗿𝗼𝘂𝗻𝗱 𝗮𝗻 𝗥𝗙𝗣 𝗺𝗮𝗿𝗸𝗲𝘁𝗽𝗹𝗮𝗰𝗲 𝘁𝗼 𝗯𝗿𝗶𝗻𝗴 𝗯𝘂𝘆𝗲𝗿𝘀 𝗮𝗻𝗱 𝘀𝗲𝗹𝗹𝗲𝗿𝘀 𝘁𝗼𝗴𝗲𝘁𝗵𝗲𝗿 𝘁𝗼 𝗿𝗲𝗱𝘂𝗰𝗲 𝗳𝗿𝗶𝗰𝘁𝗶𝗼𝗻 𝗮𝗻𝗱 𝗼𝗽𝘁𝗶𝗺𝗶𝘀𝗲 𝘁𝗵𝗲 𝗲𝗻𝗱 𝘁𝗼 𝗲𝗻𝗱 𝗽𝗿𝗼𝗰𝗲𝘀𝘀 𝗹𝗼𝗼𝗸𝘀 𝘁𝗼 𝗺𝗲 𝘁𝗼 𝗯𝗲 𝗮 𝗿𝗲𝗮𝗹𝗶𝘁𝘆.
