------------------------------------------------
HTTP PARALLEL REQUEST & THREADING LIBRARY MODULE
------------------------------------------------


CONTENTS OF THIS FILE
---------------------

 * About HTTPRL
 * Requirements
 * Configuration
 * API Overview
 * Technical Details
 * Code Examples


ABOUT HTTPRL
------------

http://drupal.org/project/httprl

HTTPRL is a flexible and powerful HTTP client implementation. It correctly
handles GET, POST, PUT, and any other HTTP request method, including the
sending of data. Requests can be issued in parallel, in either blocking or
non-blocking mode. Timeouts, maximum simultaneous connection limits, chunk
size, and the maximum number of redirects to follow can all be set. It handles
data with content-encoding and transfer-encoding headers set, follows redirects
correctly, and can optionally forward the referrer when a redirect is found.
Cookies are extracted and parsed into key/value pairs. Data can be multipart
encoded so files can easily be sent in an HTTP request. If the server does not
support range requests, HTTPRL will emulate a range request.


REQUIREMENTS
------------

Requires PHP 5. The following functions must be available on the server:
 * stream_socket_client
 * stream_select
 * stream_set_blocking
 * stream_get_meta_data
 * stream_socket_get_name
Some hosting providers disable these functions, but they come standard with
PHP 5.
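
A quick way to verify the list above on a given host is sketched below. It only
uses PHP's built-in function_exists(); the variable names are illustrative and
not part of HTTPRL.

```php
<?php
// Functions the requirements section lists as mandatory.
$required = array(
  'stream_socket_client',
  'stream_select',
  'stream_set_blocking',
  'stream_get_meta_data',
  'stream_socket_get_name',
);

// Collect every function that is not callable in this PHP build.
$missing = array();
foreach ($required as $function) {
  if (!function_exists($function)) {
    $missing[] = $function;
  }
}

if (empty($missing)) {
  echo "All required stream functions are available.\n";
}
else {
  echo 'Missing: ' . implode(', ', $missing) . "\n";
}
```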


CONFIGURATION
-------------

Settings page is located at:
6.x: admin/settings/httprl
7.x: admin/config/development/httprl

 * IP Address to send all self server requests to. If left blank it will use the
   same server as the request. If set to -1 it will use the host name instead of
   an IP address. This controls the output of httprl_build_url_self().
 * Enable background callbacks. If disabled, all background_callback keys will
   be turned into callback keys, and httprl_queue_background_callback() will
   return NULL and not queue up the request. Note that background callbacks are
   automatically disabled if the site is in maintenance mode.


API OVERVIEW
------------

Issue HTTP Requests:
httprl_build_url_self()
 - Helper function to build a URL for asynchronous requests to self.
httprl_request()
 - Queue up an HTTP request in httprl_send_request().
httprl_send_request()
 - Perform many HTTP requests.

Create and use a thread:
httprl_queue_background_callback()
 - Queue a special HTTP request (used for threading) in httprl_send_request().

Other Functions:
httprl_is_background_callback_capable()
 - See if httprl can issue a background callback.
httprl_background_processing()
 - Output text, close connection, continue processing in the background.
httprl_strlen()
 - Get the length of a string in bytes.
httprl_glue_url()
 - Alternative to http_build_url().
httprl_get_server_schema()
 - Return the server schema (http or https).
httprl_pr()
 - Pretty print data.
httprl_fast403()
 - Issue a 403 and exit.
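
To illustrate what "gluing" a URL means, here is a minimal sketch that
reassembles the array returned by PHP's parse_url(). The function name
example_glue_url() is hypothetical; this is not the httprl_glue_url()
implementation, only the general idea behind such a helper.

```php
<?php
// Hypothetical helper: rebuild a URL string from parse_url() components.
function example_glue_url($parts) {
  $url = '';
  if (isset($parts['scheme'])) {
    $url .= $parts['scheme'] . '://';
  }
  if (isset($parts['user'])) {
    $url .= $parts['user'];
    if (isset($parts['pass'])) {
      $url .= ':' . $parts['pass'];
    }
    $url .= '@';
  }
  if (isset($parts['host'])) {
    $url .= $parts['host'];
  }
  if (isset($parts['port'])) {
    $url .= ':' . $parts['port'];
  }
  if (isset($parts['path'])) {
    $url .= $parts['path'];
  }
  if (isset($parts['query'])) {
    $url .= '?' . $parts['query'];
  }
  if (isset($parts['fragment'])) {
    $url .= '#' . $parts['fragment'];
  }
  return $url;
}

// Split a URL apart and put it back together.
$original = 'http://user:pw@example.com:8080/node/1?page=2#comments';
$parts = parse_url($original);
$rebuilt = example_glue_url($parts);
echo $rebuilt . "\n";
```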


TECHNICAL DETAILS
-----------------

Using stream_select(), HTTPRL sends HTTP requests out in parallel. These
requests can be made in a blocking or non-blocking way. A blocking request
waits for the HTTP response; a non-blocking request closes the connection
without waiting for the response. The API for HTTPRL is similar to the Drupal 7
version of drupal_http_request().
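
The general stream_select() pattern can be sketched without any network access
by using local socket pairs in place of real HTTP connections. This is an
illustration of the technique only, not HTTPRL's event loop; it assumes a
Unix-like system where STREAM_PF_UNIX socket pairs are available.

```php
<?php
// Create three connected socket pairs. Index 0 plays the "client" side we
// wait on, index 1 plays the "server" side that answers.
$clients = array();
$servers = array();
for ($i = 0; $i < 3; $i++) {
  $pair = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
  stream_set_blocking($pair[0], FALSE);
  $clients[$i] = $pair[0];
  $servers[$i] = $pair[1];
}

// Each "server" writes its response and closes its end of the pair.
foreach ($servers as $i => $server) {
  fwrite($server, "response $i");
  fclose($server);
}

// Event loop: wait on all open streams at once and read whichever is ready.
$responses = array();
while (!empty($clients)) {
  $read = $clients;
  $write = NULL;
  $except = NULL;
  // Wait up to 1 second for at least one stream to become readable.
  $ready = stream_select($read, $write, $except, 1);
  if ($ready === FALSE || $ready === 0) {
    break;
  }
  foreach ($read as $stream) {
    $i = array_search($stream, $clients, TRUE);
    $chunk = fread($stream, 1024);
    if ($chunk !== FALSE && $chunk !== '') {
      $responses[$i] = (isset($responses[$i]) ? $responses[$i] : '') . $chunk;
    }
    // Once the other side has closed, stop watching this stream.
    if (feof($stream)) {
      fclose($stream);
      unset($clients[$i]);
    }
  }
}

ksort($responses);
print_r($responses);
```

A single process services all three streams, which is the same reason one PHP
request can keep many HTTP connections in flight at once.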

HTTPRL can be used independently of Drupal. For basic operations it doesn't
require any built-in Drupal functions.


CODE EXAMPLES
-------------

**Simple HTTP**

Request http://drupal.org/.

    <?php
    // Queue up the request.
    httprl_request('http://drupal.org/');
    // Execute request.
    $request = httprl_send_request();

    // Echo out the results.
    echo httprl_pr($request);
    ?>


Request http://drupal.org/robots.txt and save it to the /tmp folder.

    <?php
    // Queue up the request.
    httprl_request('http://drupal.org/robots.txt');
    // Execute request.
    $request = httprl_send_request();

    // Save file if we got a 200 back.
    if ($request['http://drupal.org/robots.txt']->code == 200) {
      file_put_contents('/tmp/robots.txt', $request['http://drupal.org/robots.txt']->data);
    }
    ?>


Request this server's own front page and the node page.

    <?php
    // Build URL to point to front page of this server.
    $url_front = httprl_build_url_self();
    // Build URL to point to /node on this server.
    $url_node = httprl_build_url_self('node');
    // Queue up the requests.
    httprl_request($url_front);
    httprl_request($url_node);
    // Execute requests.
    $request = httprl_send_request();

    // Echo out the results.
    echo httprl_pr($request);
    ?>


**Non Blocking HTTP Operations**

Request 10 URLs on this server in a non blocking manner. Check watchdog, as
this should generate ten 404s; the $request object won't contain much info.

    <?php
    // Set the blocking mode.
    $options = array(
      'blocking' => FALSE,
    );
    // Queue up the requests.
    $max = 10;
    for ($i=1; $i <= $max; $i++) {
      // Build URL to a page that doesn't exist.
      $url = httprl_build_url_self('asdf-asdf-asdf-' . $i);
      httprl_request($url, $options);
    }
    // Execute requests.
    $request = httprl_send_request();

    // Echo out the results.
    echo httprl_pr($request);
    ?>


Request 10 URLs in a non blocking manner with one httprl_request() call. These
URLs will all have the same options.

    <?php
    // Set the blocking mode.
    $options = array(
      'method' => 'HEAD',
      'blocking' => FALSE,
    );
    // Queue up the requests.
    $max = 10;
    $urls = array();
    for ($i=1; $i <= $max; $i++) {
      // Build URL to a page that doesn't exist.
      $urls[] = httprl_build_url_self('asdf-asdf-asdf-' . $i);
    }
    // Queue up the requests.
    httprl_request($urls, $options);
    // Execute requests.
    $request = httprl_send_request();

    // Echo out the results.
    echo httprl_pr($request);
    ?>


Request 1000 URLs in a non blocking manner with one httprl_request() call. These
URLs will all have the same options. This will saturate the server and any
connections that couldn't be made will be dropped.

    <?php
    // Set the blocking mode.
    $options = array(
      'method' => 'HEAD',
      'blocking' => FALSE,
      'domain_connections' => 1000,
      'global_connections' => 1000,
    );
    // Queue up the requests.
    $max = 1000;
    $urls = array();
    for ($i=1; $i <= $max; $i++) {
      // Build URL to a page that doesn't exist.
      $urls[] = httprl_build_url_self('asdf-asdf-asdf-' . $i);
    }
    // Queue up the requests.
    httprl_request($urls, $options);
    // Execute requests.
    $request = httprl_send_request();

    // Echo out the results.
    echo httprl_pr($request);
    ?>


Request 1000 URLs in a non blocking manner with one httprl_request() call. These
URLs will all have the same options. This will saturate the server. Because
`async_connect` is FALSE, HTTPRL waits for each connection to be established,
so usually all 1000 requests will eventually hit the server.

    <?php
    // Set the blocking mode.
    $options = array(
      'method' => 'HEAD',
      'blocking' => FALSE,
      'async_connect' => FALSE,
      // domain_connections must be smaller than the server's max number of
      // clients.
      'domain_connections' => 32,
      'global_connections' => 1000,
    );
    // Queue up the requests.
    $max = 1000;
    $urls = array();
    for ($i=1; $i <= $max; $i++) {
      // Build URL to a page that doesn't exist.
      $urls[] = httprl_build_url_self('asdf-asdf-asdf-' . $i);
    }
    // Queue up the requests.
    httprl_request($urls, $options);
    // Execute requests.
    $request = httprl_send_request();

    // Echo out the results.
    echo httprl_pr($request);
    ?>
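
How per-domain and global connection caps interact can be sketched as a small
selection function. This is an assumption about the general idea behind
domain_connections and global_connections, not HTTPRL's scheduler; the function
name and the example hosts are hypothetical.

```php
<?php
// Hypothetical helper: from the queued URLs, pick those that may start now
// without exceeding the per-host limit or the overall limit.
function example_pick_startable($queued, $active, $domain_limit, $global_limit) {
  // Count connections that are already open, per host.
  $per_host = array();
  foreach ($active as $url) {
    $host = parse_url($url, PHP_URL_HOST);
    $per_host[$host] = isset($per_host[$host]) ? $per_host[$host] + 1 : 1;
  }

  $open = count($active);
  $start = array();
  foreach ($queued as $url) {
    // Stop once the overall cap is reached.
    if ($open >= $global_limit) {
      break;
    }
    // Skip URLs whose host is already at its cap; they wait for a free slot.
    $host = parse_url($url, PHP_URL_HOST);
    $count = isset($per_host[$host]) ? $per_host[$host] : 0;
    if ($count >= $domain_limit) {
      continue;
    }
    $per_host[$host] = $count + 1;
    $open++;
    $start[] = $url;
  }
  return $start;
}

$queued = array(
  'http://a.example/1',
  'http://a.example/2',
  'http://a.example/3',
  'http://b.example/1',
);
// At most 2 connections per host and 3 overall.
$start = example_pick_startable($queued, array(), 2, 3);
print_r($start);
```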


**HTTP Operations and Callbacks**

Use a callback in the event loop to do processing on the request. In this case
we are going to use httprl_pr() as the callback function.

    <?php
    // Setup return variable.
    $x = '';
    // Setup options array.
    $options = array(
      'method' => 'HEAD',
      'callback' => array(
        array(
          'function' => 'httprl_pr',
          'return' => &$x,
        ),
      ),
    );
    // Build URL to point to front page of this server.
    $url_front = httprl_build_url_self();
    // Queue up the request.
    httprl_request($url_front, $options);
    // Execute request.
    $request = httprl_send_request();

    // Echo returned value from function callback.
    echo $x;
    ?>
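
The 'return' => &$x entry works because PHP keeps references stored inside
arrays. The sketch below shows the mechanism with a hypothetical dispatcher,
example_run_callback(), which passes a result as the first argument and writes
the callback's return value back through the reference. It is an illustration
of the pattern, not HTTPRL's own code.

```php
<?php
// Hypothetical dispatcher: element 0 holds the settings, any further elements
// are extra arguments appended after the result.
function example_run_callback(&$callback, $result) {
  $settings = &$callback[0];
  $args = array_merge(array($result), array_slice($callback, 1));
  $value = call_user_func_array($settings['function'], $args);
  if (array_key_exists('return', $settings)) {
    // Writing to the referenced slot updates the caller's variable.
    $settings['return'] = $value;
  }
}

// Setup return variable.
$x = '';
$callback = array(
  array(
    'function' => 'strtoupper',
    'return' => &$x,
  ),
);
example_run_callback($callback, 'done');

// $x now holds the value returned by strtoupper().
echo $x . "\n";
```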


Use a background callback in the event loop to do processing on the request.
In this case we are going to use httprl_pr() as the callback function. A
background callback creates a new thread to run this function in.

    <?php
    // Setup return variable.
    $x = '';
    // Setup options array.
    $options = array(
      'method' => 'HEAD',
      'background_callback' => array(
        array(
          'function' => 'httprl_pr',
          'return' => &$x,
        ),
      ),
    );
    // Build URL to point to front page of this server.
    $url_front = httprl_build_url_self();
    // Queue up the request.
    httprl_request($url_front, $options);
    // Execute request.
    $request = httprl_send_request();

    // Echo returned value from function callback.
    echo $x;
    ?>


Use a background callback in the event loop to do processing on the request.
In this case we are going to use print_r() as the callback function. A
background callback creates a new thread to run this function in. The first
argument passed in is the request object, the FALSE tells print_r to echo out
instead of returning a value.

    <?php
    // Setup return & print variable.
    $x = '';
    $y = '';
    // Setup options array.
    $options = array(
      'method' => 'HEAD',
      'background_callback' => array(
        array(
          'function' => 'print_r',
          'return' => &$x,
          'printed' => &$y,
        ),
        FALSE,
      ),
    );
    // Build URL to point to front page of this server.
    $url_front = httprl_build_url_self();
    // Queue up the request.
    httprl_request($url_front, $options);
    // Execute request.
    $request = httprl_send_request();

    // Echo what was returned and printed from function callback.
    echo $x . "<br />\n";
    echo $y;
    ?>


**More Advanced HTTP Operations**

Hit 5 different URLs, keeping the first 2 that come back with a status code of
200 and erroring out the others that didn't return fast enough. The Range
header is used so only the first and last 128 bytes are returned.

    <?php
    // Array of URLs to get.
    $urls = array(
      'http://google.com/',
      'http://bing.com/',
      'http://yahoo.com/',
      'http://www.duckduckgo.com/',
      'http://www.drupal.org/',
    );

    // Process list of URLs.
    $options = array(
      'alter_all_streams_function' => 'need_two_good_results',
      'headers' => array('Range' => 'bytes=0-127,-128'),
    );
    // Queue up the requests.
    httprl_request($urls, $options);

    // Execute requests.
    $requests = httprl_send_request();

    // Print what was done.
    echo httprl_pr($requests);

    function need_two_good_results($id, &$responses) {
      static $counter = 0;
      foreach ($responses as $id => &$result) {
        // Skip if we got a 200.
        if ($result->code == 200) {
          $counter += 1;
          continue;
        }
        if ($result->status == 'Done.') {
          continue;
        }

        if ($counter >= 2) {
          // Set the code to request was aborted.
          $result->code = HTTPRL_REQUEST_ABORTED;
          $result->error = 'Software caused connection abort.';
          // Set status to done and set timeout.
          $result->status = 'Done.';
          $result->options['timeout'] -= $result->running_time;

          // Close the file pointer and remove the stream from the array.
          fclose($result->fp);
          unset($result->fp);
        }
      }
    }
    ?>
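
The Range value 'bytes=0-127,-128' used above asks for two byte ranges: the
first 128 bytes and the last 128 bytes. The sketch below resolves such a value
to absolute offsets for a body of known length. The helper name is hypothetical
and the code assumes a well-formed header; it is not how HTTPRL emulates range
requests.

```php
<?php
// Hypothetical helper: turn a "bytes=" range spec into inclusive
// array(start, end) offsets for a body that is $length bytes long.
function example_resolve_ranges($header, $length) {
  $resolved = array();
  $spec = substr($header, strlen('bytes='));
  foreach (explode(',', $spec) as $range) {
    list($first, $last) = explode('-', trim($range), 2);
    if ($first === '') {
      // Suffix range "-N": the last N bytes of the body.
      $start = max(0, $length - (int) $last);
      $end = $length - 1;
    }
    else {
      $start = (int) $first;
      // "N-" runs to the end of the body; "N-M" is clamped to the body.
      $end = ($last === '') ? $length - 1 : min((int) $last, $length - 1);
    }
    if ($start <= $end) {
      $resolved[] = array($start, $end);
    }
  }
  return $resolved;
}

// A 1000 byte body: bytes 0-127 and bytes 872-999 are selected.
$ranges = example_resolve_ranges('bytes=0-127,-128', 1000);
print_r($ranges);
```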


Send 2 files in one field via a POST request.

    <?php
    // Send request to front page.
    $url_front = httprl_build_url_self();
    // Set options.
    $options = array(
      'method' => 'POST',
      'data' => array(
        'x' => 1,
        'y' => 2,
        'z' => 3,
        'files' => array(
          'core_js' => array(
            'misc/form.js',
            'misc/batch.js',
          ),
        ),
      ),
    );
    // Queue up the request.
    httprl_request($url_front, $options);
    // Execute request.
    $request = httprl_send_request();
    // Echo what was returned.
    echo httprl_pr($request);
    ?>
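
For readers unfamiliar with multipart encoding, the sketch below builds a
multipart/form-data body by hand. The function name, the fixed boundary, the
in-memory file contents, and the 'core_js[]' field naming are all assumptions
made for illustration; HTTPRL's own encoder may differ in every one of these
details.

```php
<?php
// Hypothetical encoder: plain fields first, then files, each part separated
// by the boundary and terminated with CRLF.
function example_multipart_encode($fields, $files, $boundary) {
  $eol = "\r\n";
  $body = '';
  foreach ($fields as $name => $value) {
    $body .= '--' . $boundary . $eol;
    $body .= 'Content-Disposition: form-data; name="' . $name . '"' . $eol . $eol;
    $body .= $value . $eol;
  }
  foreach ($files as $file) {
    $body .= '--' . $boundary . $eol;
    $body .= 'Content-Disposition: form-data; name="' . $file['name'] . '"; filename="' . $file['filename'] . '"' . $eol;
    $body .= 'Content-Type: application/octet-stream' . $eol . $eol;
    $body .= $file['contents'] . $eol;
  }
  // The closing boundary carries two extra dashes.
  $body .= '--' . $boundary . '--' . $eol;
  return $body;
}

$fields = array('x' => 1, 'y' => 2, 'z' => 3);
// Two files sent under one field name; contents are inlined for the example.
$files = array(
  array('name' => 'core_js[]', 'filename' => 'form.js', 'contents' => '// form'),
  array('name' => 'core_js[]', 'filename' => 'batch.js', 'contents' => '// batch'),
);
$boundary = 'example-boundary';
$body = example_multipart_encode($fields, $files, $boundary);
// Header value that would accompany this body.
$content_type = 'multipart/form-data; boundary=' . $boundary;
echo $body;
```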


**Threading Examples**

Use 2 threads to load up 4 different nodes.

    <?php
    // Bail out here if background callbacks are disabled.
    if (!httprl_is_background_callback_capable()) {
      return FALSE;
    }

    // List of nodes to load; 241-244.
    $nodes = array(241 => '', 242 => '', 243 => '', 244 => '');
    foreach ($nodes as $nid => &$node) {
      // Setup callback options array.
      $callback_options = array(
        array(
          'function' => 'node_load',
          'return' => &$node,
          // Setup options array.
          'options' => array(
            'domain_connections' => 2, // Only use 2 threads for this request.
          ),
        ),
        $nid,
      );
      // Queue up the request.
      httprl_queue_background_callback($callback_options);
    }
    // Execute request.
    httprl_send_request();

    // Echo what was returned.
    echo httprl_pr($nodes);
    ?>


Run a function in the background. Notice that there is no return or printed key
in the callback options.

    <?php
    // Bail out here if background callbacks are disabled.
    if (!httprl_is_background_callback_capable()) {
      return FALSE;
    }

    // Setup callback options array; call watchdog in the background.
    $callback_options = array(
      array(
        'function' => 'watchdog',
      ),
      'httprl-test', 'background watchdog call done', array(), WATCHDOG_DEBUG,
    );
    // Queue up the request.
    httprl_queue_background_callback($callback_options);

    // Execute request.
    httprl_send_request();
    ?>


Pass by reference example. Example is D7 only; pass by reference works in
D6 & D7.

    <?php
    // Code from system_rebuild_module_data().
    $modules = _system_rebuild_module_data();
    ksort($modules);

    // Show first module before running system_get_files_database().
    echo httprl_pr(current($modules));

    // Bail out here if background callbacks are disabled.
    if (!httprl_is_background_callback_capable()) {
      return FALSE;
    }

    $callback_options = array(
      array(
        'function' => 'system_get_files_database',
        'return' => '',
      ),
      &$modules, 'module'
    );
    httprl_queue_background_callback($callback_options);

    // Execute requests.
    httprl_send_request();

    // Show first module after running system_get_files_database().
    echo httprl_pr(current($modules));
    ?>


Get 2 results from 2 different queries at the hook_boot bootstrap level in D6.

    <?php
    // Run 2 queries and get the result.
    $x = db_result(db_query_range("SELECT filename FROM {system} ORDER BY filename ASC", 0, 1));
    $y = db_result(db_query_range("SELECT filename FROM {system} ORDER BY filename DESC", 0, 1));
    echo $x . "<br />\n" . $y . "<br />\n";
    unset($x, $y);


    // Bail out here if background callbacks are disabled.
    if (!httprl_is_background_callback_capable()) {
      return FALSE;
    }

    // Run above 2 queries and get the result via a background callback.
    $args = array(
      // First query.
      array(
        'type' => 'function',
        'call' => 'db_query_range',
        'args' => array('SELECT filename FROM {system} ORDER BY filename ASC', 0, 1),
      ),
      array(
        'type' => 'function',
        'call' => 'db_result',
        'args' => array('last' => NULL),
        'return' => &$x,
      ),

      // Second Query.
      array(
        'type' => 'function',
        'call' => 'db_query_range',
        'args' => array('SELECT filename FROM {system} ORDER BY filename DESC', 0, 1),
      ),
      array(
        'type' => 'function',
        'call' => 'db_result',
        'args' => array('last' => NULL),
        'return' => &$y,
      ),
    );
    $callback_options = array(array('return' => ''), &$args);
    // Queue up the request.
    httprl_queue_background_callback($callback_options);
    // Execute request.
    httprl_send_request();

    // Echo what was returned.
    echo httprl_pr($x, $y);
    ?>
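
The $args structure above describes a chain of calls in which each step can
feed on the previous step's result. A minimal runner for such a chain might
look like the sketch below. The function example_run_chain() is hypothetical,
and its reading of the 'last' key (replace it with the previous return value)
and of 'method' (call it on the previous return value) is an assumption
inferred from the examples in this file, not taken from HTTPRL's source.

```php
<?php
// Hypothetical runner for an array of call descriptions.
function example_run_chain(&$steps) {
  $last = NULL;
  foreach ($steps as &$step) {
    // Build the argument list; an argument keyed 'last' is swapped for the
    // value returned by the previous step.
    $args = array();
    foreach ($step['args'] as $key => $value) {
      $args[] = ($key === 'last') ? $last : $value;
    }
    // A 'method' step is invoked on the previous step's return value.
    if ($step['type'] === 'method') {
      $callable = array($last, $step['call']);
    }
    else {
      $callable = $step['call'];
    }
    $last = call_user_func_array($callable, $args);
    // Hand the value back through the reference, if one was supplied.
    if (array_key_exists('return', $step)) {
      $step['return'] = $last;
    }
  }
  unset($step);
  return $last;
}

$x = '';
$y = '';
$args = array(
  // Function chain: str_repeat() output is fed into strtoupper().
  array('type' => 'function', 'call' => 'str_repeat', 'args' => array('ab', 3)),
  array('type' => 'function', 'call' => 'strtoupper', 'args' => array('last' => NULL), 'return' => &$x),
  // Method chain: date_create() returns an object, format() is called on it.
  array('type' => 'function', 'call' => 'date_create', 'args' => array('2000-01-01')),
  array('type' => 'method', 'call' => 'format', 'args' => array('Y'), 'return' => &$y),
);
example_run_chain($args);
echo $x . "\n" . $y . "\n";
```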


Get 2 results from 2 different queries at the hook_boot bootstrap level in D7.

    <?php
    $x = db_select('system', 's')
      ->fields('s', array('filename'))
      ->orderBy('filename', 'ASC')
      ->range(0, 1)
      ->execute()
      ->fetchField();
    $y = db_select('system', 's')
      ->fields('s', array('filename'))
      ->orderBy('filename', 'DESC')
      ->range(0, 1)
      ->execute()
      ->fetchField();
    echo $x . "<br />\n" . $y . "<br />\n";
    unset($x, $y);


    // Bail out here if background callbacks are disabled.
    if (!httprl_is_background_callback_capable()) {
      return FALSE;
    }

    // Run above 2 queries and get the result via a background callback.
    $args = array(
      // First query.
      array(
        'type' => 'function',
        'call' => 'db_select',
        'args' => array('system', 's',),
      ),
      array(
        'type' => 'method',
        'call' => 'fields',
        'args' => array('s', array('filename')),
      ),
      array(
        'type' => 'method',
        'call' => 'orderBy',
        'args' => array('filename', 'ASC'),
      ),
      array(
        'type' => 'method',
        'call' => 'range',
        'args' => array(0, 1),
      ),
      array(
        'type' => 'method',
        'call' => 'execute',
        'args' => array(),
      ),
      array(
        'type' => 'method',
        'call' => 'fetchField',
        'args' => array(),
        'return' => &$x,
      ),

      // Second Query.
      array(
        'type' => 'function',
        'call' => 'db_select',
        'args' => array('system', 's',),
      ),
      array(
        'type' => 'method',
        'call' => 'fields',
        'args' => array('s', array('filename')),
      ),
      array(
        'type' => 'method',
        'call' => 'orderBy',
        'args' => array('filename', 'DESC'),
      ),
      array(
        'type' => 'method',
        'call' => 'range',
        'args' => array(0, 1),
      ),
      array(
        'type' => 'method',
        'call' => 'execute',
        'args' => array(),
      ),
      array(
        'type' => 'method',
        'call' => 'fetchField',
        'args' => array(),
        'return' => &$y,
      ),
    );
    $callback_options = array(array('return' => ''), &$args);
    // Queue up the request.
    httprl_queue_background_callback($callback_options);
    // Execute request.
    httprl_send_request();

    // Echo what was returned.
    echo httprl_pr($x, $y);
    ?>


Run a cache clear at the DRUPAL_BOOTSTRAP_FULL level as the current user in a
non blocking background request.

    <?php
    // Normal way to do this.
    drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);
    module_load_include('inc', 'system', 'system.admin');
    system_clear_cache_submit();


    // Bail out here if background callbacks are disabled.
    if (!httprl_is_background_callback_capable()) {
      return FALSE;
    }

    // How to do it in a non blocking background request.
    $args = array(
      array(
        'type' => 'function',
        'call' => 'drupal_bootstrap',
        'args' => array(DRUPAL_BOOTSTRAP_FULL),
      ),
      array(
        'type' => 'function',
        'call' => 'module_load_include',
        'args' => array('inc', 'system', 'system.admin'),
      ),
      array(
        'type' => 'function',
        'call' => 'system_clear_cache_submit',
        'args' => array('', ''),
      ),
      array(
        'type' => 'function',
        'call' => 'watchdog',
        'args' => array('httprl-test', 'background cache clear done', array(), WATCHDOG_DEBUG),
      ),
    );

    // Pass the current session to the sub request.
    if (!empty($_COOKIE[session_name()])) {
      $options = array('headers' => array('Cookie' => session_name() . '=' . $_COOKIE[session_name()] . ';'));
    }
    else {
      $options = array();
    }
    $callback_options = array(array('options' => $options), &$args);

    // Queue up the request.
    httprl_queue_background_callback($callback_options);
    // Execute request.
    httprl_send_request();
    ?>


Print 'My Text', cut the connection by sending the data over the wire, and
continue processing in the background.

    <?php
    httprl_background_processing('My Text');
    // Everything after this point does not affect page load time.
    sleep(5);
    echo 'You should not see this text';
    ?>
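
One common way to get this "respond now, keep working" behavior in plain PHP is
sketched below. The function name is hypothetical and this is not a copy of
httprl_background_processing(); whether the client really disconnects early
depends on the web server and SAPI in use.

```php
<?php
// Hypothetical helper: send $text, tell the client the response is complete,
// then let the script carry on.
function example_respond_then_continue($text) {
  // Keep running even if the client disconnects.
  ignore_user_abort(TRUE);
  if (!headers_sent()) {
    // An exact length lets the client know when the body has fully arrived.
    header('Connection: close');
    header('Content-Length: ' . strlen($text));
  }
  echo $text;
  // Flush every PHP output buffer, then the web server's buffer.
  while (ob_get_level() > 0) {
    ob_end_flush();
  }
  flush();
  // Under PHP-FPM this ends the request from the client's point of view.
  if (function_exists('fastcgi_finish_request')) {
    fastcgi_finish_request();
  }
}
```

Calling example_respond_then_continue('My Text') before a slow task would let
the visitor's page finish loading while the task keeps running.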