RAG Module API Reference¶
Auto-generated from Python docstrings
LLM Factory¶
green_gov_rag.rag.llm_factory ¶
LLM Provider Factory for multi-platform support.
Supports OpenAI, Azure OpenAI, AWS Bedrock, and Anthropic. Uses LangChain for abstraction across providers.
LLMFactory ¶
Factory for creating LLM instances based on provider configuration.
Source code in green_gov_rag/rag/llm_factory.py
create_llm staticmethod ¶
create_llm(provider: str | None = None, model: str | None = None, temperature: float = 0.2, max_tokens: int = 500) -> BaseLanguageModel
Create an LLM instance based on the provider.
Parameters:
- provider: LLM provider (openai, azure, bedrock, anthropic). Defaults to settings.llm_provider.
- model: Model name. Defaults to settings.llm_model.
- temperature: Sampling temperature.
- max_tokens: Maximum number of tokens in the response.
Returns:
- LangChain BaseLanguageModel instance.
Raises:
- ValueError: If the provider is not supported or required credentials are missing.
Source code in green_gov_rag/rag/llm_factory.py
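The provider dispatch described for create_llm can be pictured with a small stand-alone sketch. It does not import green_gov_rag or LangChain: `StubLLM`, `DEFAULT_MODELS`, the placeholder model names, and the case-insensitive provider match are all assumptions made for illustration. Only the four provider names, the default `temperature` and `max_tokens`, and the `ValueError` on an unsupported provider come from the reference above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class StubLLM:
    """Hypothetical stand-in for a LangChain BaseLanguageModel."""

    provider: str
    model: str
    temperature: float
    max_tokens: int


# Placeholder model names, invented for this sketch only.
DEFAULT_MODELS = {
    "openai": "example-openai-model",
    "azure": "example-azure-deployment",
    "bedrock": "example-bedrock-model",
    "anthropic": "example-anthropic-model",
}


def create_llm(
    provider: str = "openai",
    model: str | None = None,
    temperature: float = 0.2,
    max_tokens: int = 500,
) -> StubLLM:
    """Return a stub LLM for a supported provider, else raise ValueError."""
    key = provider.lower()  # assumption: provider names are matched case-insensitively
    if key not in DEFAULT_MODELS:
        raise ValueError(f"Unsupported LLM provider: {provider!r}")
    return StubLLM(key, model or DEFAULT_MODELS[key], temperature, max_tokens)


llm = create_llm("bedrock", temperature=0.0)
print(llm.provider, llm.max_tokens)
```

In the real factory the fallback values would come from settings.llm_provider and settings.llm_model rather than a hard-coded table.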
get_llm ¶
get_llm(provider: str | None = None, model: str | None = None, temperature: float = 0.2, max_tokens: int = 500) -> BaseLanguageModel
Convenience function to get an LLM instance.
Parameters:
- provider: LLM provider (openai, azure, bedrock, anthropic)
- model: Model name
- temperature: Sampling temperature
- max_tokens: Maximum number of tokens in the response
Returns:
- LangChain BaseLanguageModel instance
Source code in green_gov_rag/rag/llm_factory.py
Vector Store Factory¶
green_gov_rag.rag.vector_store_factory ¶
Factory for creating vector store instances.
VectorStoreFactory ¶
Factory for creating vector store instances based on configuration.
Source code in green_gov_rag/rag/vector_store_factory.py
create_vector_store staticmethod ¶
create_vector_store(embeddings: Embeddings, store_type: str | None = None, **kwargs) -> VectorStoreInterface
Create a vector store instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| embeddings | Embeddings | Embeddings model to use | required |
| store_type | str \| None | Type of store ('faiss', 'qdrant', 'chromadb'). If None, uses settings.vector_store_type | None |
| **kwargs | | Additional arguments for specific store implementations | {} |
Returns:
| Name | Type | Description |
|---|---|---|
| VectorStoreInterface | VectorStoreInterface | Initialized vector store |
Raises:
| Type | Description |
|---|---|
| ValueError | If store_type is not supported |
Examples:
>>> from green_gov_rag.rag.embeddings import ChunkEmbedder
>>> from green_gov_rag.rag.vector_store_factory import VectorStoreFactory
>>> embeddings = ChunkEmbedder().embedder
>>> store = VectorStoreFactory.create_vector_store(embeddings)
>>> # Explicitly choose Qdrant
>>> store = VectorStoreFactory.create_vector_store(
... embeddings,
... store_type='qdrant',
... url='http://localhost:6333'
... )
Source code in green_gov_rag/rag/vector_store_factory.py
get_available_stores staticmethod ¶
Get list of available vector store types.
Returns:
| Type | Description |
|---|---|
| list[str] | List of supported store types |
Source code in green_gov_rag/rag/vector_store_factory.py
validate_config staticmethod ¶
Validate configuration for a vector store type.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| store_type | str \| None | Type to validate, or None for current config | None |
Returns:
| Type | Description |
|---|---|
| dict | Dictionary with validation results |
Examples:
>>> VectorStoreFactory.validate_config('qdrant')
{
'valid': True,
'store_type': 'qdrant',
'issues': [],
'config': {'url': 'http://localhost:6333', ...}
}
Source code in green_gov_rag/rag/vector_store_factory.py
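The result shape shown in the example (valid, store_type, issues, config) can be reproduced with a small stand-alone sketch. The `REQUIRED_KEYS` table and its setting names are hypothetical; the real validator may check different settings for each store type.

```python
# Hypothetical required settings per store type (illustrative only).
REQUIRED_KEYS = {
    "faiss": ["index_path"],
    "qdrant": ["url"],
    "chromadb": ["persist_directory"],
}


def validate_config(store_type: str, config: dict) -> dict:
    """Collect configuration issues and report them in a result dictionary."""
    issues = []
    if store_type not in REQUIRED_KEYS:
        issues.append(f"Unsupported store type: {store_type}")
    else:
        for key in REQUIRED_KEYS[store_type]:
            if not config.get(key):
                issues.append(f"Missing setting: {key}")
    return {
        "valid": not issues,
        "store_type": store_type,
        "issues": issues,
        "config": dict(config),
    }


print(validate_config("qdrant", {"url": "http://localhost:6333"}))
```

Returning the list of issues instead of raising lets a caller report every problem in one pass.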
create_vector_store ¶
create_vector_store(embeddings: Embeddings, store_type: str | None = None, **kwargs) -> VectorStoreInterface
Create a vector store instance.
Convenience wrapper around VectorStoreFactory.create_vector_store()
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| embeddings | Embeddings | Embeddings model | required |
| store_type | str \| None | Type of store (faiss, qdrant, chromadb) | None |
| **kwargs | | Additional store-specific arguments | {} |
Returns:
| Name | Type | Description |
|---|---|---|
| VectorStoreInterface | VectorStoreInterface | Initialized vector store |
Source code in green_gov_rag/rag/vector_store_factory.py
Embeddings¶
green_gov_rag.rag.embeddings ¶
Embeddings module.
Generate vector embeddings for document chunks using either AWS Bedrock LLM or HuggingFace embedding models.
- Supports two embedding providers:
  - HuggingFace (sentence-transformers)
  - AWS Bedrock (via OpenAI-compatible API)
- Takes chunk dicts with content + metadata.
- Returns dicts with embedding included.
- Easily integrated into your ETL pipeline after chunker.py.
Now uses centralized settings from green_gov_rag.config
ChunkEmbedder ¶
Source code in green_gov_rag/rag/embeddings.py
__init__ ¶
Initialize embedding generator.
Parameters:
- provider: "bedrock" or "huggingface"
- model_name: Name of the model to use
Source code in green_gov_rag/rag/embeddings.py
embed_chunks ¶
Generate embeddings for a list of chunk dictionaries using batching.
Parameters:
- chunks: List of dicts with at least {"content": str, "metadata": dict}
- batch_size: Number of chunks to embed per batch (default: 100)
- show_progress: Show progress information (default: True)
Returns:
- List of dicts with {"content", "metadata", "embedding"}
Source code in green_gov_rag/rag/embeddings.py
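The batching behaviour described for embed_chunks might look roughly like the sketch below. It is a stand-alone illustration, not the ChunkEmbedder code: the `embed_batch` callable and the `fake_embed` helper are hypothetical stand-ins for a real embedding model, and progress reporting is omitted.

```python
from typing import Callable


def embed_chunks(
    chunks: list[dict],
    embed_batch: Callable[[list[str]], list[list[float]]],
    batch_size: int = 100,
) -> list[dict]:
    """Embed chunk contents batch by batch and attach each vector to its chunk."""
    results = []
    for start in range(0, len(chunks), batch_size):
        batch = chunks[start:start + batch_size]
        vectors = embed_batch([chunk["content"] for chunk in batch])
        for chunk, vector in zip(batch, vectors):
            results.append(
                {
                    "content": chunk["content"],
                    "metadata": chunk["metadata"],
                    "embedding": vector,
                }
            )
    return results


def fake_embed(texts: list[str]) -> list[list[float]]:
    """Hypothetical embedder: one-dimensional vector holding the text length."""
    return [[float(len(text))] for text in texts]


chunks = [{"content": "tree rules", "metadata": {"page": 1}}]
print(embed_chunks(chunks, fake_embed, batch_size=2))
```

Sending one request per batch instead of one per chunk keeps the number of provider calls at roughly len(chunks) / batch_size.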
Enhanced Response¶
green_gov_rag.rag.enhanced_response ¶
Enhanced Response Generator with Citations and Deep Links.
This module provides advanced RAG response formatting with:
1. Inline citations with source numbers [1], [2], etc.
2. Deep links to specific PDF pages/sections
3. Hierarchical section path display (e.g., "Section 2.1.3")
4. Source attribution with document metadata
5. Confidence scoring for cited passages
Citation ¶
A citation linking answer text to source document.
Source code in green_gov_rag/rag/enhanced_response.py
__init__ ¶
Initialize citation.
Parameters:
- source_number: Citation number (1, 2, 3, etc.)
- document: Source Document object
- text_snippet: Text excerpt that was cited
- confidence: Confidence score for this citation (0-1)
Source code in green_gov_rag/rag/enhanced_response.py
get_deep_link ¶
Generate deep link to specific page/section in PDF.
Returns¶
URL with fragment identifier for PDF page
Source code in green_gov_rag/rag/enhanced_response.py
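One common way to build such a link is the `#page=N` fragment that most PDF viewers honour. The sketch below assumes the source URL and page number are already known; the function name `build_deep_link` and its parameters are hypothetical and do not describe how Citation stores its metadata.

```python
from __future__ import annotations


def build_deep_link(source_url: str, page_number: int | None) -> str:
    """Append a PDF page fragment to a URL, replacing any existing fragment."""
    if page_number is None:
        return source_url
    base = source_url.split("#", 1)[0]
    return f"{base}#page={page_number}"


print(build_deep_link("https://example.org/guide.pdf", 12))
```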
get_section_path ¶
Get hierarchical section path (e.g., 'Section 2.1.3').
Returns¶
Formatted section path string
Source code in green_gov_rag/rag/enhanced_response.py
format_citation_markdown ¶
Format citation as markdown with link.
Returns¶
Markdown-formatted citation string
Source code in green_gov_rag/rag/enhanced_response.py
to_dict ¶
Convert citation to dictionary.
Returns¶
Dict representation of citation
Source code in green_gov_rag/rag/enhanced_response.py
EnhancedResponse ¶
Enhanced RAG response with inline citations and source attribution.
Source code in green_gov_rag/rag/enhanced_response.py
__init__ ¶
Initialize enhanced response.
Parameters:
- answer: Generated answer text
- sources: List of source Documents used
- query: Original user query
Source code in green_gov_rag/rag/enhanced_response.py
format_answer_with_inline_citations ¶
Format answer with inline citation markers.
Returns¶
Answer text with inline [1], [2], etc. citations
Source code in green_gov_rag/rag/enhanced_response.py
format_sources_markdown ¶
Format sources as markdown list with deep links.
Returns¶
Markdown-formatted sources section
Source code in green_gov_rag/rag/enhanced_response.py
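A numbered sources list with links could be assembled along the following lines. This is an illustrative sketch only: the dictionary keys (`title`, `url`, `section`) and the exact layout are assumptions, and the real method works on Citation objects rather than plain dicts.

```python
def format_sources_markdown(sources: list[dict]) -> str:
    """Render sources as a numbered Markdown list whose numbers match [1], [2], ..."""
    lines = ["**Sources**"]
    for number, source in enumerate(sources, start=1):
        label = source["title"]
        if source.get("section"):
            label += f", {source['section']}"
        lines.append(f"{number}. [{label}]({source['url']})")
    return "\n".join(lines)


example = [
    {
        "title": "NGER Guide",
        "url": "https://example.org/nger.pdf#page=4",
        "section": "Section 2.1.3",
    },
    {"title": "Tree Policy", "url": "https://example.org/tree.pdf"},
]
print(format_sources_markdown(example))
```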
format_full_response_markdown ¶
Format complete response with answer and sources.
Returns¶
Complete markdown response
Source code in green_gov_rag/rag/enhanced_response.py
to_dict ¶
Convert response to dictionary format.
Returns¶
Dict representation for API/JSON responses
Source code in green_gov_rag/rag/enhanced_response.py
ResponseFormatter ¶
Utility class for formatting RAG responses with citations.
Source code in green_gov_rag/rag/enhanced_response.py
create_enhanced_response staticmethod ¶
Create an enhanced response with citations.
Parameters:
- query: User query
- answer: Generated answer
- sources: Source documents
Returns:
- EnhancedResponse object
Source code in green_gov_rag/rag/enhanced_response.py
format_with_hierarchical_context staticmethod ¶
Format sources with hierarchical section context.
Parameters:
- sources: List of source documents
Returns:
- List of formatted source dictionaries
Source code in green_gov_rag/rag/enhanced_response.py
Hybrid Search¶
green_gov_rag.rag.hybrid_search ¶
Hybrid Geospatial Search for GreenGovRAG.
Combines vector similarity search, spatial filtering, and metadata filtering following the Elasticsearch/Bedrock geospatial RAG pattern.
Key Features:
1. Vector similarity search (semantic search)
2. Spatial filtering by LGA codes, state, or coordinates
3. Metadata filtering (jurisdiction, topic, ESG scope)
4. Hierarchical spatial filtering (federal → state → local)
5. Re-ranking by relevance
SpatialQuery dataclass ¶
Spatial query parameters extracted from user input.
Source code in green_gov_rag/rag/hybrid_search.py
HybridGeospatialSearch ¶
Combine lexical, spatial, and vector search for geospatial RAG.
Source code in green_gov_rag/rag/hybrid_search.py
__init__ ¶
Initialize hybrid search with vector store.
Parameters:
- vector_store: VectorStore instance for similarity search
- enable_ner: Whether to enable NER for automatic location extraction
Source code in green_gov_rag/rag/hybrid_search.py
search ¶
search(query: str, spatial_query: Optional[SpatialQuery] = None, metadata_filters: Optional[dict] = None, k: int = 10, enable_query_expansion: bool = True) -> list[Document]
Hybrid search combining vector, spatial, and metadata filtering.
Parameters:
- query: User query string
- spatial_query: Optional SpatialQuery for location-based filtering
- metadata_filters: Optional dict for metadata filtering
- k: Number of initial results to retrieve (before filtering)
- enable_query_expansion: Whether to expand acronyms in query (default: True)
Returns:
- List of Document objects ranked by relevance
Source code in green_gov_rag/rag/hybrid_search.py
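Because k counts the results retrieved before filtering, a search can return fewer than k documents once the filters are applied. The sketch below illustrates that post-filtering step on plain dictionaries. The function name, the document layout, and the matching rules (equality for scalar filters, any-overlap for list filters) are assumptions, not the behaviour of HybridGeospatialSearch.

```python
def filter_by_metadata(docs: list[dict], filters: dict) -> list[dict]:
    """Keep documents whose metadata satisfies every filter, preserving rank order."""

    def matches(metadata: dict) -> bool:
        for key, wanted in filters.items():
            value = metadata.get(key)
            if isinstance(wanted, list):
                # List filter: pass if any wanted value appears in the metadata.
                values = value if isinstance(value, list) else [value]
                if not set(values) & set(wanted):
                    return False
            elif value != wanted:
                return False
        return True

    return [doc for doc in docs if matches(doc["metadata"])]


# Documents as they might arrive from the vector store, best match first.
ranked = [
    {"id": "a", "metadata": {"state": "SA", "frameworks": ["NGER"]}},
    {"id": "b", "metadata": {"state": "NSW", "frameworks": ["ISSB"]}},
    {"id": "c", "metadata": {"state": "SA", "frameworks": ["ISSB", "GRI"]}},
]
print([doc["id"] for doc in filter_by_metadata(ranked, {"state": "SA"})])
```

When filters are strict, requesting a larger k gives the filter step more candidates to choose from.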
search_with_lga ¶
search_with_lga(query: str, lga_name: str, lga_code: str, state: str, k: int = 10) -> list[Document]
Convenience method for LGA-based search.
Parameters:
- query: User query string
- lga_name: Name of the LGA (e.g., "City of Adelaide")
- lga_code: ABS LGA code (e.g., "40070")
- state: State code (e.g., "SA")
- k: Number of results to return
Returns:
- List of Document objects relevant to the LGA
Source code in green_gov_rag/rag/hybrid_search.py
search_with_esg_filters ¶
search_with_esg_filters(query: str, emission_scopes: list[str] | None = None, frameworks: list[str] | None = None, greenhouse_gases: list[str] | None = None, consolidation_method: str | None = None, methodology_type: str | None = None, scope_3_categories: list[str] | None = None, regulator: str | None = None, activity_types: list[str] | None = None, industry_codes: list[str] | None = None, k: int = 10) -> list[Document]
Convenience method for ESG-filtered search.
Parameters:
- query: User query string
- emission_scopes: List of emission scopes (e.g., ["scope_1", "scope_2"])
- frameworks: List of frameworks (e.g., ["NGER", "ISSB", "GHG_Protocol"])
- greenhouse_gases: List of gases (e.g., ["CO2", "CH4", "N2O", "SF6", "HFCs", "PFCs", "NF3"])
- consolidation_method: Consolidation approach (e.g., "operational_control", "equity_share", "financial_control")
- methodology_type: Methodology type (e.g., "calculation", "reporting", "verification")
- scope_3_categories: List of Scope 3 categories (e.g., ["upstream_transport", "business_travel"])
- regulator: Regulator name (e.g., "Clean Energy Regulator", "NSW EPA")
- activity_types: List of activity types (e.g., ["fuel_combustion", "electricity_consumption"])
- industry_codes: List of ANZSIC industry codes (e.g., ["B0600"])
- k: Number of results to return
Returns:
- List of Document objects matching ESG criteria
Source code in green_gov_rag/rag/hybrid_search.py
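Since every ESG argument is optional, a convenience method like this plausibly reduces to collecting the arguments that were actually supplied into a metadata filter dict. The helper below is a hypothetical sketch of that step, not the library's code; in particular, treating the argument names as metadata keys is an assumption.

```python
def build_esg_filters(**criteria) -> dict:
    """Keep only the criteria the caller supplied (anything not None)."""
    return {key: value for key, value in criteria.items() if value is not None}


filters = build_esg_filters(
    emission_scopes=["scope_1", "scope_2"],
    frameworks=["NGER"],
    regulator=None,  # omitted criteria do not constrain the search
)
print(filters)
```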
search_with_auto_location ¶
Search with automatic location extraction from query text.
Uses NER to extract LGA codes and states from the query, then performs spatial filtering automatically.
Parameters:
- query: User query text (e.g., "What are tree rules in Adelaide?")
- k: Number of results to return
Returns:
- List of Document objects matching query and extracted locations
Example:¶
>>> search_with_auto_location("emission rules in Port Adelaide Enfield", k=5)
# Automatically extracts LGA code "40280" and state "SA"
Source code in green_gov_rag/rag/hybrid_search.py
search_by_jurisdiction_and_category ¶
search_by_jurisdiction_and_category(query: str, jurisdiction: str | None = None, category: str | None = None, topic: str | None = None, region: str | None = None, k: int = 10) -> list[Document]
Search filtered by jurisdiction, category, and topic.
Parameters:
- query: User query string
- jurisdiction: Jurisdiction level (e.g., "federal", "state", "local")
- category: Document category (e.g., "environment", "planning", "legislation")
- topic: Specific topic (e.g., "emissions_reporting", "biodiversity", "tree_management")
- region: Region name (e.g., "South Australia", "New South Wales")
- k: Number of results to return
Returns:
- List of Document objects matching criteria
Source code in green_gov_rag/rag/hybrid_search.py
search_nger_compliant ¶
search_nger_compliant(query: str, reportable_under_nger: bool = True, nger_threshold_tonnes: int | None = None, k: int = 10) -> list[Document]
Search for NGER-compliant documents.
Parameters:
- query: User query string
- reportable_under_nger: Filter for NGER reportability
- nger_threshold_tonnes: Filter by NGER threshold (e.g., 25000, 100000)
- k: Number of results to return
Returns:
- List of NGER-compliant Document objects
Source code in green_gov_rag/rag/hybrid_search.py
search_scope_3 ¶
search_scope_3(query: str, scope_3_categories: list[str] | None = None, frameworks: list[str] | None = None, include_issb: bool = True, k: int = 10) -> list[Document]
Search for Scope 3 emissions guidance.
Parameters:
- query: User query string
- scope_3_categories: List of Scope 3 categories to filter by:
  - purchased_goods_services (Cat 1)
  - capital_goods (Cat 2)
  - fuel_energy_activities (Cat 3)
  - upstream_transport (Cat 4)
  - waste_generated (Cat 5)
  - business_travel (Cat 6)
  - employee_commuting (Cat 7)
  - upstream_leased_assets (Cat 8)
  - downstream_transport (Cat 9)
  - processing_sold_products (Cat 10)
  - use_of_sold_products (Cat 11)
  - end_of_life_treatment (Cat 12)
  - downstream_leased_assets (Cat 13)
  - franchises (Cat 14)
  - investments (Cat 15)
- frameworks: ESG frameworks (e.g., ["ISSB", "GHG_Protocol", "GRI"])
- include_issb: Whether to include ISSB standards (default: True)
- k: Number of results to return
Returns:
- List of Scope 3 Document objects
Source code in green_gov_rag/rag/hybrid_search.py
search_scope_3_by_type ¶
Search Scope 3 emissions by upstream or downstream type.
Parameters:
- query: User query string
- scope_type: Either "upstream" (categories 1-8) or "downstream" (categories 9-15)
- k: Number of results to return
Returns:
- List of Scope 3 Document objects filtered by type
Source code in green_gov_rag/rag/hybrid_search.py
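The upstream/downstream split can be expressed as a slice over the 15 category names listed for search_scope_3. The sketch below is an illustration under that assumption: the constant and function names are hypothetical, and raising ValueError for an unknown scope_type is an assumed behaviour.

```python
# Category names in GHG Protocol order (Cat 1 through Cat 15), as listed above.
SCOPE_3_CATEGORIES = [
    "purchased_goods_services",
    "capital_goods",
    "fuel_energy_activities",
    "upstream_transport",
    "waste_generated",
    "business_travel",
    "employee_commuting",
    "upstream_leased_assets",
    "downstream_transport",
    "processing_sold_products",
    "use_of_sold_products",
    "end_of_life_treatment",
    "downstream_leased_assets",
    "franchises",
    "investments",
]


def categories_for(scope_type: str) -> list[str]:
    """Return categories 1-8 for "upstream" and 9-15 for "downstream"."""
    if scope_type == "upstream":
        return SCOPE_3_CATEGORIES[:8]
    if scope_type == "downstream":
        return SCOPE_3_CATEGORIES[8:]
    raise ValueError(f"scope_type must be 'upstream' or 'downstream', got {scope_type!r}")


print(categories_for("downstream"))
```

The resulting list could then be passed as scope_3_categories to search_scope_3.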
advanced_search ¶
advanced_search(query: str, lga_codes: list[str] | None = None, state: str | None = None, jurisdiction: str | None = None, category: str | None = None, topic: str | None = None, emission_scopes: list[str] | None = None, frameworks: list[str] | None = None, greenhouse_gases: list[str] | None = None, regulator: str | None = None, industry_codes: list[str] | None = None, facility_types: list[str] | None = None, k: int = 10) -> list[Document]
Advanced search with multiple filter types.
Combines spatial, metadata, and ESG filters for precise retrieval.
Parameters:
- query: User query string
- lga_codes: List of LGA codes for spatial filtering
- state: State code for spatial filtering
- jurisdiction: Jurisdiction level (federal/state/local)
- category: Document category
- topic: Specific topic
- emission_scopes: List of emission scopes
- frameworks: List of ESG frameworks
- greenhouse_gases: List of greenhouse gases
- regulator: Regulator name
- industry_codes: List of ANZSIC codes
- facility_types: List of facility types
- k: Number of results to return
Returns:
- List of filtered and ranked Document objects
Source code in green_gov_rag/rag/hybrid_search.py
Location NER¶
green_gov_rag.rag.location_ner ¶
Named Entity Recognition for Location Extraction.
Extracts Australian locations (LGAs, states, cities) from text queries and maps them to standardized codes for geospatial filtering.
Uses both rule-based matching and LLM-based extraction for robustness.
LocationNER ¶
Extract and normalize Australian locations from text.
Source code in green_gov_rag/rag/location_ner.py
__init__ ¶
Initialize location NER.
Parameters:
- use_llm: Whether to use LLM for extraction (more accurate)
- llm_model: OpenAI model to use for LLM-based extraction
Source code in green_gov_rag/rag/location_ner.py
extract_locations ¶
Extract locations from text using both rule-based and LLM methods.
Parameters:
- text: Query text to extract locations from
Returns:
- Dict with extracted locations:
  {
    "states": ["SA", "NSW"],
    "lgas": [{"name": "Adelaide", "code": "40070", "state": "SA"}],
    "raw_locations": ["Adelaide", "South Australia"]
  }
Source code in green_gov_rag/rag/location_ner.py
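The rule-based half of the extraction can be approximated with a dictionary lookup that produces the result shape documented above. This is a stand-alone sketch: the two lookup tables, the whole-word regular-expression matching, and the choice to add an LGA's state to "states" are assumptions, and the LLM-based path is left out entirely.

```python
import re

# Tiny illustrative lookup tables; the real mappings are far larger.
LGA_LOOKUP = {"adelaide": {"name": "Adelaide", "code": "40070", "state": "SA"}}
STATE_LOOKUP = {"south australia": "SA", "new south wales": "NSW"}


def extract_locations(text: str) -> dict:
    """Match known LGA and state names as whole words, case-insensitively."""
    lowered = text.lower()
    result = {"states": [], "lgas": [], "raw_locations": []}
    for name, lga in LGA_LOOKUP.items():
        if re.search(rf"\b{re.escape(name)}\b", lowered):
            result["lgas"].append(dict(lga))
            result["raw_locations"].append(lga["name"])
            if lga["state"] not in result["states"]:
                result["states"].append(lga["state"])
    for name, code in STATE_LOOKUP.items():
        if re.search(rf"\b{re.escape(name)}\b", lowered):
            result["raw_locations"].append(name.title())
            if code not in result["states"]:
                result["states"].append(code)
    return result


print(extract_locations("What are tree rules in Adelaide?"))
```

A plain lookup like this would also match "adelaide" inside "Port Adelaide Enfield", so a fuller implementation would need to prefer the longest matching name.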
extract_lga_codes ¶
Extract LGA codes from text (convenience method).
Parameters:
- text: Query text
Returns:
- List of LGA codes
Source code in green_gov_rag/rag/location_ner.py
extract_state_codes ¶
Extract state codes from text (convenience method).
Parameters:
- text: Query text
Returns:
- List of state codes
Source code in green_gov_rag/rag/location_ner.py
add_lga_mapping ¶
Add a new LGA mapping.
Parameters:
- name: Common name (e.g., "adelaide")
- lga_code: ABS LGA code
- state: State code (e.g., "NSW", "VIC")
- official_name: Official LGA name (defaults to capitalized name)
Source code in green_gov_rag/rag/location_ner.py
QueryLocationProcessor ¶
Process queries to extract and enrich with location information.
Source code in green_gov_rag/rag/location_ner.py
__init__ ¶
Initialize processor.
Parameters:
- ner: LocationNER instance (creates one if not provided)
process_query ¶
Process query and extract location metadata.
Parameters:
- query: User query text
Returns:
- Dict with query and location metadata